
NodeBots Day 2015 – Medellin

Last Saturday I had the opportunity to take part in NodeBots Day 2015 in Medellín (@NodeBotsMed). It was a really fun day and the first of these events I have attended.

Venue

The venue was a pleasant surprise. The RutaN building is beautiful and, in the style you see all over Medellín, it is full of greenery and nature.

NodeBots Day Medellin - RutaN

Introduction

In the spirit of NodeBots Day there was no formal talk, just an introduction by our facilitator @Julian_Duque, who told us a bit about the origins of JavaScript robotics, the Serial Port library, and the Johnny-Five module.

Getting to Work

We all quickly got to work and, armed with SumoBot kits, started assembling the bots. SumoBots are open source models designed specifically for NodeBots Day: easy to build, easy to assemble, and very inexpensive. It may sound like an exaggeration, but it had been a long time since I had done hands-on work with technology that now feels unfamiliar: wires, screws, drills, utility knives, glue, sandpaper!

(Source: MeetupJS)

Firmware and Software

To give the NodeBots some firmware and program them for a bit of play, we had ElectricImp Dev Kits on hand and used the Imp IDE to flash the little modules. That last part was a whole experience in itself, the kind that makes the inner nerd kid we all carry smile!

Internet of Things

ElectricImp makes implementing an IoT solution very easy. Basically, it connects the modules (Devices) over WiFi to ElectricImp's cloud of 'Agents'. We used the Imp IDE to upload the tyrion and imp-io libraries so we could control the NodeBots remotely.

 

ElectricImp Architecture

Sumo

What could be more fun to do on a Saturday afternoon than sumo fights between NodeBots controlled remotely from the cloud? There is nothing more to say; here are some photos uploaded by attendees to the NodeBots Meetup page.

New friends from the SumoBots team: Carlos Alberto Mesa Rivillas, Elias Quintero, and Alejandro Berrío.

NodeBots Day Medellin - Equipo


Introduction to Azure Data Factory

We live in a world where data is coming at us from everywhere. IoT is evolving so quickly that almost every device now seems capable of producing valuable information, from water quality sensors to smartwatches. At the same time, the amount of data collected is growing exponentially in volume, variety, and complexity. Learning something useful from terabytes of data stored in different places (data sources spread across geographic locations) becomes a complex scenario that requires custom logic which has to be maintained and updated over time.

There are several tools and services nowadays that simplify the process of extracting, transforming and loading (ETL) data from different, and most likely heterogeneous, data sources into a single destination: an Enterprise Data Warehouse (EDW). Their goal is to obtain meaningful business information (insights) that can help improve products and drive decisions.

In this post, we are going to explore Azure Data Factory, the Microsoft cloud service for performing ETL operations to compose streamlined data pipelines that can be later consumed by BI tools or monitored to pinpoint issues and take corrective actions.

Azure Data Factory

Azure Data Factory is a fully managed service that merges the traditional EDWs with other modern Big Data scenarios like Social feeds (Twitter, Facebook), device information and other IoT scenarios. This service lets you:

  • Easily work with diverse data storage and processing systems, meaning you can process both on-premises data (like a SQL Server) and cloud data sources such as Azure SQL Database, Blob, Tables, HDInsight, etc.
  • Transform data into trusted information, via Hive, Pig and custom C# code activities that Data Factory can fully manage on your behalf (meaning that, for instance, no manual Hadoop cluster setup or management is required).
  • Monitor data pipelines in one place. For this, you can use an up-to-the-moment monitoring dashboard to quickly assess end-to-end data pipeline health, pinpoint issues, and take corrective action if needed.
  • Get rich insights from transformed data. You can create data pipelines that produce trusted data, which can be later consumed by BI and analytic tools.


Now that we know the basics let’s see each of these features in a real scenario. For this, we are going to use the Gaming customer profiling sample pipeline provided in the Azure Preview Portal. You can easily deploy this Data Factory in your own Azure subscription following this tutorial and explore it using the Azure Preview Portal. For instance, this is the Data Factory diagram of this sample (you can visualize it by clicking the Diagram tile inside the Data Factory blade):


The following is a brief description of the sample:

“Contoso is a gaming company that creates games for multiple platforms: game consoles, hand held devices, and personal computers (PCs). Each of these games produces tons of logs. Contoso’s goal is to collect and analyze the logs produced by these games to get usage information, identify up-sell and cross-sell opportunities, develop new compelling features, etc. to improve business and provide better experience to customers. This sample collects sample logs, processes and enriches them with reference data, and transforms the data to evaluate the effectiveness of a marketing campaign that Contoso has recently launched. “

Easily work with diverse data storage and processing systems

Azure Data Factory currently supports the following data sources: Azure Storage (Blob and Tables), Azure SQL, Azure DocumentDB, On-premises SQL Server, On-premises Oracle, On-premises File System, On-premises MySQL, On-premises DB2, On-premises Teradata, On-premises Sybase and On-premises PostgreSQL.

For instance, the Data Factory sample combines information from Azure Blob Storage:


Transform data into trusted information

Azure Data Factory currently supports the following activities: Copy Activity (on-premises to cloud, and cloud to on-premises), HDInsight Activity (Pig, Hive, MapReduce, Hadoop Streaming transformations), Azure Machine Learning Batch Scoring Activity, Azure SQL Stored Procedure activity, Custom .NET activities.
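Custom .NET activities are the extensibility point for transformations that the built-in activities do not cover. As a rough sketch only (this assumes the Data Factory .NET SDK of the time, in which a custom activity implements the IDotNetActivity interface; exact type names, namespaces, and signatures may differ), such an activity looks something like this:

using System.Collections.Generic;
using Microsoft.Azure.Management.DataFactories.Models;
using Microsoft.Azure.Management.DataFactories.Runtime;

// Hypothetical custom activity; the real transformation logic goes inside Execute.
public class SampleTransformActivity : IDotNetActivity
{
    public IDictionary<string, string> Execute(
        IEnumerable<LinkedService> linkedServices,
        IEnumerable<Dataset> datasets,
        Activity activity,
        IActivityLogger logger)
    {
        // Data Factory calls Execute when the activity runs for a slice.
        logger.Write("Running custom activity '{0}'.", activity.Name);

        // Read the input datasets, transform the data, and write the results
        // to the output datasets using the provided linked services.

        // Returning an empty dictionary is fine for this sketch.
        return new Dictionary<string, string>();
    }
}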

In the Data Factory sample, one of the pipelines executes 2 activities: an HDInsight Hive Activity to bring data from 2 different blob storage tables into a single blob storage table and a Copy Activity to copy the results of the previous activity (in an Azure Blob) to an Azure SQL Database.


Monitor data pipelines in one place

You can use the Azure Preview Portal to view details about the Data Factory resource, like linked services, datasets and their details, the latest runs of the activities and their status, etc. You can also configure the resource to send notifications when an operation is complete or has failed (more details here)


Get rich insights from transformed data

You can use data pipelines to deliver transformed data from the cloud to on-premises sources like SQL Server, or keep it in your cloud storage sources for consumption by BI tools and other applications.

In this sample we collect log information and reference data that is then transformed to evaluate the effectiveness of marketing campaigns, as seen in the image below:


Next steps


Twitter Sentiment Analysis

> Note: If you are not familiar with machine learning you can start with this post which explains the basic concepts of Machine Learning and the Azure Machine Learning service.

The purpose of this post is to explain how to build an experiment for sentiment analysis using Azure Machine Learning and then publish it as a public API that can be consumed by any application that needs this feature for a particular business scenario (e.g. gathering users' opinions about a product or brand). Since the Azure Marketplace already offers a Text Analytics API for English, we decided to create one for Spanish. And to simplify things, we used the sample Twitter Sentiment analysis experiment available in the Azure Machine Learning Gallery.

Creating a custom dataset

This was our greatest challenge: creating a valid dataset with Spanish content. There is an existing dataset used in the sample experiment we are going to build on, which you can find here. This experiment is based on an original dataset of 1,600,000 tweets classified as negative or positive. The Azure ML Studio sample dataset contains only 10% of this data (160,000 records). In supervised learning, the more training data you have, the more accurate your trained model will be, which is why the first thing we want is a dataset with a considerable amount of data.

As this dataset is in English, the predictive model will learn to process English text. But since we want to create a service using the Spanish language, our data needs to be in Spanish.

To get the data in Spanish we could use Spanish tweets and manually classify them (which would take a long time) or use the original dataset translated to Spanish. In the latter option, the hard work of classifying the data is already done and we could use an automatic translation tool to do the work for us. Although automatic translation is not 100% accurate, the keywords will be there, so we’re going to go with this approach to make sure we have a good quantity of training data.

For this reason we created a very simple console application that uses the Bing Translate API to translate our dataset and return it in the correct format.
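As an illustration only, the general shape of such a console application is sketched below; the file names and the TranslateToSpanishAsync helper are hypothetical stand-ins for the actual Bing Translate API call:

using System;
using System.IO;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // File names are placeholders for the real input/output paths.
        TranslateDatasetAsync("sentiment140-sample.tsv", "sentiment140-sample-es.tsv").Wait();
    }

    // Reads "<sentiment_label>\t<tweet_text>" rows, translates the text column,
    // and writes the rows back out in the same TAB-separated format.
    static async Task TranslateDatasetAsync(string inputPath, string outputPath)
    {
        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath))
        {
            string line;
            while ((line = await reader.ReadLineAsync()) != null)
            {
                var columns = line.Split('\t');
                if (columns.Length < 2)
                {
                    continue; // skip malformed rows
                }

                // Hypothetical helper that wraps the call to the translation API.
                columns[1] = await TranslateToSpanishAsync(columns[1]);
                await writer.WriteLineAsync(string.Join("\t", columns));
            }
        }
    }

    static Task<string> TranslateToSpanishAsync(string text)
    {
        // Call the translation service here and return the translated text.
        throw new NotImplementedException();
    }
}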

Once we have the dataset ready, the next step is to upload it to Azure ML studio so it is available to use in the experiments.

To upload the recently created dataset, in the Azure ML portal click NEW, select DATASET, and then click FROM LOCAL FILE. In the dialog box that opens, select the file you want to upload, type a name, and select the dataset type (this is usually inferred automatically). In our case, it is a TAB separated values file (.tsv).

uploading a new dataset

The dataset contains only 2 columns: sentiment_label, which is 0 for a negative sentiment and 4 for a positive one, and tweet_text, which contains the text of the tweet.

Sample input data

Once the dataset is created, we will take advantage of the existing sample experiment of the Machine Learning Gallery, available here.

Open the experiment by clicking Open in Studio as shown below.

sample experiment

Then, you will be prompted to copy the experiment from the Gallery to your workspace.

copying from gallery

At this point let’s remove the Reader module from the experiment and add the custom dataset we created. Connect the dataset to the Execute R Script module.

Run the experiment.

Running the experiment

Pre-processing the data

This experiment uses several modules to pre-process the data before analyzing its content (like removing punctuation marks or special characters, or adjusting the data to fit the algorithm used). For more information about the data preprocessing, you can read the information available in the experiment page in the Gallery.

Scoring the model

After running the experiment, let’s create the scoring experiment. To do this, point to SET UP WEB SERVICE and select Predictive Web Service [Recommended].

setting up the web service

Once the Predictive Experiment is created, we need to update this experiment to make it work as expected. First delete the Filter Based Feature Selection Module and reconnect the Feature Hashing module to the Score Model module.

Delete the connection between the Score Model module and the Web Service Output module by right-clicking it and clicking Delete.

deleting a conection

Between those two modules, add a Project Columns module, and then an Execute R Script module. Connect them in sequence and also with the Web Service Output module. The resulting experiment will resemble the following image.

resulting experiment

Now let’s configure the Project Columns module. Select it and in the Properties pane, click Launch column selector. In the dialog box that opens, in the row with the Include dropdown, go to the text field and add the four available columns (sentiment_label, tweet_text, Scored Labels, and Scored Probabilities).

projecting columns

Lastly, select the Execute R Script to configure it. Click inside the R Script text box and replace the existing script with the following:

# Map 1-based optional input ports to variables
dataset1 <- maml.mapInputPort(1) # class: data.frame

# Set the thresholds for classification
threshold1 <- 0.60
threshold2 <- 0.45
positives <- which(dataset1["Scored Probabilities"] > threshold1)
negatives <- which(dataset1["Scored Probabilities"] < threshold2)
neutrals <- which(dataset1["Scored Probabilities"] <= threshold1 &
                  dataset1["Scored Probabilities"] >= threshold2)

# One label per row of the scored dataset
new.labels <- matrix(nrow = nrow(dataset1), ncol = 1)
new.labels[positives] <- "positive"
new.labels[negatives] <- "negative"
new.labels[neutrals] <- "neutral"

data.set <- data.frame(assigned = new.labels,
                       confidence = dataset1["Scored Probabilities"])
colnames(data.set) <- c('Sentiment', 'Score')

# Select data.frame to be sent to the output Dataset port
maml.mapOutputPort("data.set");

This will return two columns as the output of the service: Sentiment and Score.

The sentiment column will be returned as Positive, Neutral or Negative and the Score column will be the Score Probability. The classification will be made based on the defined thresholds and will fall into the following 3 categories:

  • Less than 0.45: Negative
  • Between 0.45 and 0.60: Neutral
  • Above 0.60: Positive

Now that everything is set up, we can run the experiment.

Publishing and Testing the Web Service

Once the predictive experiment finishes, click Deploy Web Service. The deployed service screen will appear. Click Test.

published web service

In the Enter data to predict dialog box, enter a text in Spanish in the TWEET_TEXT parameter and click the check mark button.

entering data to predict

Wait for the web service to predict the results, which will be shown as an alert.

prediction results

We also made the following test page, which uses the generated API to test the service.

testing app
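To give an idea of how an application like that test page can call the service from code, here is a hedged sketch; the endpoint URL, API key, and input column names are placeholders rather than the real values of our service, and the JSON shape follows the request-response pattern used by classic Azure ML web services:

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class SentimentClient
{
    // Placeholders: the real values are shown on the web service dashboard.
    const string EndpointUrl =
        "https://<region>.services.azureml.net/workspaces/<workspace-id>/services/<service-id>/execute?api-version=2.0&details=true";
    const string ApiKey = "<api-key>";

    static void Main()
    {
        ScoreAsync("Me encantó la película, la recomiendo").Wait();
    }

    static async Task ScoreAsync(string tweetText)
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Bearer", ApiKey);

            // Single-row input; the column names must match the input schema of the
            // predictive experiment. Proper JSON escaping is omitted for brevity.
            var body = "{\"Inputs\":{\"input1\":{\"ColumnNames\":[\"sentiment_label\",\"tweet_text\"]," +
                       "\"Values\":[[\"0\",\"" + tweetText + "\"]]}},\"GlobalParameters\":{}}";

            var response = await client.PostAsync(
                EndpointUrl, new StringContent(body, Encoding.UTF8, "application/json"));

            // The response contains the Sentiment and Score columns produced by the R script above.
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}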

Next Steps

We tested the resulting API with some sample text, and we are pleased with the outcome (the model learned how to classify Spanish texts quite well). Nevertheless, there are some ways to improve the model we have created, such as:

  • Trying other training algorithms and comparing their performance
  • Improving the input dataset, either by having a brand new dataset with manually classified information in Spanish or using common keywords for getting classified results.

Given that this is a proof of concept, we consider this to be a successful experiment.


Docker Compose: Scaling Multi-Container Applications

Introduction

In the Docker Compose: Creating Multi-Container Applications blog post we talked about Docker Compose, the benefits it offers, and used it to create a multi-container application. Then, in the Running a .NET application as part of a Docker Compose in Azure blog post, we explained how to create a multi-container application composed of a .NET web application and a Redis service. So far, so good.

However, although we can easily get multi-container applications up and running using Docker Compose, in real environments (e.g. production) we need to ensure that our application will continue responding even if it receives numerous requests. In order to achieve this, those in charge of configuring the environment usually create multiple instances of the web application and set up a load balancer in front of them. So, the question here is: Could we do this using Docker Compose? Fortunately, Docker Compose offers a really simple way to create multiple instances of any of the services defined in the Compose.

Please notice that although Docker Compose is not considered production ready yet, the goal of this post is to show how a particular service can be easily scaled using this feature so you know how to do it when the final version is released.

Running applications in more than one container with “scale”

Docker Compose allows you to generate multiple containers for your services running the same image. Using the “scale” command in combination with a load balancer, we can easily configure scalable applications.

The “scale” command sets the number of containers to run for a particular service. For example, if we wanted to run a front end web application service in 10 different containers, we would run “docker-compose scale <service-name>=10”.

Considering the scenario we worked on in the Running a .NET application as part of a Docker Compose in Azure blog post, how could we scale the .NET web application service to run in 3 different containers at the same time? Let’s see…

Check/update the docker-compose.yml file

The first thing we need to do is ensure that the service we want to scale does not specify the external/host port. If we specify that port, the service cannot be scaled since all the instances would try to use the same host port. So, we just need to make sure that the service we want to scale only defines the private port in order to let Docker choose a random host port when the container instances are created.

But, how do we specify only the private port? The port value can be configured as follows:

  • If we want to specify the external/host port and the private port, the “ports” configuration would look like this:
    “<external-port>:<private-port>”
  • If we want to specify only the private port, this would be the “ports” configuration:
    “<private-port>”

In our scenario, we want to scale the .NET web application service called “net”; therefore, that service should not specify the external port. As you can see in our docker-compose.yml file displayed below, the ports specification for the “net” service only contains one port, which is the private one. So, we are good to go.

net:
  image: websiteondocker
  ports:
   - "210"
  links:
   - redis
redis:
  image: redis

Remember that the private port we specify here must be the one we provided when we published the .NET application from Visual Studio since the application is configured to work on that port.

Scaling a service

Now that we have the proper configuration in the docker-compose.yml file, we are ready to scale the web application.

If we don’t have our Compose running or have modified the docker-compose.yml file, we would need to recreate the Compose by running “docker-compose up -d”.

Once we have the Compose running, let’s check the containers we have running as part of the Compose by executing “docker-compose ps”:

clip_image002

As you can see, there is one container running that corresponds to the “net” service (.NET web application) and another container corresponding to the Redis service.

Now, let’s scale our web application to run in 3 containers. To do this, we just need to run the scale command as follows:

docker-compose scale net=3

 

In the previous command, “net” is the name of the service that we want to scale and “3” is the number of instances we want. As a result of running this command, 2 new containers running the .NET web application will be created.

clip_image002[7]

If we check the Docker Compose containers now, we’ll see the new ones:

clip_image003

We need to consider that Docker Compose remembers the number of instances set in the scale command. So, from now on, every time we run “docker-compose up -d” to recreate the Compose, 3 containers running the .NET web application will be created. If we only want 1 instance of the web application again, we can run “docker-compose scale net=1”. In this case, Docker Compose will delete the extra containers.

At this point, we have 3 different containers running the .NET web application. But, how hard would it be to add a load balancer in front of these containers? Well, adding a load balancer container is pretty easy.

Configuring a load balancer

There are different proxy images that offer the possibility of balancing the load between different containers. We tested one of them: tutum/haproxy.

When we created the .NET web application, we included logic to display the name of the machine where the requests are processed:

@{
    ViewBag.Title = "Home Page";
}

<h3>Hits count: @ViewBag.Message</h3>
<h3>Machine Name: @Environment.MachineName</h3>

So, once we set a load balancer in front of the 3 containers, the application should display different container IDs.

Let’s create the load balancer. In our scenario, we can create a new container using the tutum/haproxy image to balance the load between the web application containers by applying any of the following methods:

  • Manually start the load balancer container:
    We can manually start a container running the tutum/haproxy image by running the command displayed below. We would need to provide the different web app container names in order to indicate to the load balancer where it should send the requests.

docker run -d -p 80:80 --link <web-app-1-container-name>:<web-app-1-container-name> --link <web-app-2-container-name>:<web-app-2-container-name> ... --link <web-app-N-container-name>:<web-app-N-container-name> tutum/haproxy

  • Include the load balancer configuration as part of the Docker Compose:
    We can update the docker-compose.yml file in order to include the tutum/haproxy configuration. This way, the load balancer would start when the Compose is created and the site would be accessible just by running one command. Below, you can see what the configuration corresponding to the load balancer service would look like. The “haproxy” service definition specifies a link to the “net” service. This is enough to let the load balancer know that it should distribute the requests between the instances of the “net” service, which correspond to the .NET web application.

haproxy:
  image: tutum/haproxy
  links:
   - net
  ports:
   - "80:80"

In our scenario, we will apply the second approach since it allows us to start the whole environment by running just one command. Although in general we think that it is better to include the load balancer configuration in the Compose configuration file, please keep in mind that starting the load balancer together with the rest of the Compose may not always be the best solution. For example, if you scaled the web application service adding new instances and you want the load balancer to start considering those instances without the site being down too long, restarting the load balancer container manually may be faster than recreating the whole compose.

Continuing with our example, let’s update the “docker-compose.yml” file to include the “haproxy” service configuration.

First, open the file:

vi docker-compose.yml

 

Once the file opens, press i (“Insert”) to start editing the file. Here, we will add the configuration corresponding to the “haproxy” service:

haproxy:
  image: tutum/haproxy
  links:
   - net
  ports:
   - "80:80"
net:
  image: websiteondocker
  ports:
   - "210"
  links:
   - redis
redis:
  image: redis

Finally, to save the changes we made to the file, just press Esc and then :wq (write and quit).

At this point, we are ready to recreate the Compose by running “docker-compose up -d”.

image

As you can see in the previous image, the existing containers were recreated and additionally, a new container corresponding to the “haproxy” service was created.

So, Docker Compose started the load balancer container, but is the site working? Let’s check it out!

First, let’s look at the containers we have running:

image

As you can see, the load balancer container is up and running in port 80. So, since we already have an endpoint configured in our Azure VM for this port, let’s access the URL corresponding to our VM.

clip_image001

The site is running! Please notice that the Container ID is displayed on the page. Checking the displayed value against the result we got from the “docker ps” command, we can see that the request was processed by the “netcomposetest_net_3” container.

If we reload the page, this time the request should be processed by a different container.

clip_image002

This time, the request was processed by the “netcomposetest_net_4” container.

At this point we have validated that the .NET web application is running in different containers and that the load balancer is working. Plus, we have verified that all the containers are consuming information from the same Redis service instance since, as you can see, the amount of hits increased even when the requests were processed by different web application instances.

Now, what happens if we need to stop one of the web application containers? Do we need to stop everything? The answer is “No”. We can stop a container, and the load balancer will notice it and won’t send new requests to that container. The best thing here is that the site continues running!

Let’s validate this in our example. Since we have 3 web application containers running, we can stop 2 of them and then try to access the site.

To stop the containers, we can run the “docker stop <container-name>” command. Looking at the result we got from the “docker ps” command, we can see that our containers are called “netcomposetest_net_3”, “netcomposetest_net_4” and “netcomposetest_net_5”. Let’s stop the “netcomposetest_net_3” and “netcomposetest_net_4” containers.

clip_image003[1]

Now, if we reload the page, we will see that the site is still working!

clip_image004[1]

This time the request was processed by the only web application container we have running: “netcomposetest_net_5”.

If we keep reloading the page, we will see that all the requests are processed by this container.

clip_image005


Running a .NET application as part of a Docker Compose in Azure

Introduction

In the “Docker Compose: Creating Multi-Container Applications” blog post we talked about Docker Compose, the benefits it offers, and used it to create a multi-container application based on the example provided in the Quick Start section of the Overview of Docker Compose page. The example consists of a Python web application that consumes information from a Redis service.

Since at our company we work mainly with Microsoft technologies, we wondered, “How hard would it be to configure a similar scenario for a .NET 5 web application?” Let’s find out!

Required configuration

The first thing we need is a Linux virtual machine with Docker running in Azure. We already have one in place, but in case you don’t, in the post “Getting Started with Docker in Azure” we explain how to create a Virtual Machine in Azure directly from Visual Studio 2015 and deploy a .NET application to a Docker container in that machine.
And, of course, in order to work with Docker Compose, we need to have the feature installed in the VM. If you haven’t installed it yet, you can follow the steps outlined in the “Docker Compose: Creating Multi-Container Applications” blog post.

Preparing our .NET application

Once we have the Virtual Machine running in Azure with Docker and Docker Compose installed, we are ready to start working. So, let’s open Visual Studio 2015 and create a new ASP.NET Web Application project using the Web Site template from the available ASP.NET Preview Templates.

clip_image001[4]

Now that we have a .NET 5 application to start working on, we will modify it to reproduce the scenario we saw with Python.

As we did with the Python application, we will configure our .NET application to get information from a Redis service and also update its information. This way, we can reproduce the scenario of getting the information from Redis that tells us how many times the web page was accessed, increase the value and store the updated value back in Redis.

The first thing we need to solve here is how we should set up the communication between our web site and the Redis service without making the Redis port public. To achieve this, we will configure the Compose to set up a link between our app and the Redis service (we’ll explain how to do this later). If we specify that link, when the Compose is created, an environment variable called “REDIS_PORT_6379_TCP_ADDR” is generated by Docker. We will simply use this variable to establish communication with Redis. Let’s see what the resulting code looks like.

The following is the code corresponding to the Index action of the Home controller.

public IActionResult Index()
{
    ViewBag.Message = this.GetHitsFromRedis();
    return View();
}

public int GetHitsFromRedis()
{
    int hits = 0;

    // Redis host resolved from the environment variable generated by the Compose link.
    var db = Environment.GetEnvironmentVariable("REDIS_PORT_6379_TCP_ADDR");

    using (var redisClient = new RedisClient(db))
    {
        try
        {
            hits = redisClient.Get<Int32>("hits");
            hits++;
            redisClient.Set<Int32>("hits", hits);
            return hits;
        }
        catch (Exception)
        {
            return -1;
        }
    }
}

The logic displayed above is getting the value corresponding to the “REDIS_PORT_6379_TCP_ADDR” environment variable, using it to create a Redis client, and finally using the client to get the hits value and store the updated value back to Redis.

In addition to updating the controller, we also performed some changes to the Index view and the layout to remove the HTML created by default.

For the Index view, we removed the default code and added logic to display the amount of hits (accesses to the page) obtained from the controller and the name of the machine where the web application is running. We added this second line in order to validate that the site is running in the container created by the Compose.

@{
    ViewBag.Title = "Home Page";
}

<h3>Hits count: @ViewBag.Message</h3>
<h3>Machine Name: @Environment.MachineName</h3>

Regarding the layout of the site, we removed unnecessary code and left the following:

@inject IOptions<AppSettings> AppSettings
<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
        <title>@ViewBag.Title – @AppSettings.Options.SiteTitle</title>
    </head>
    <body>
        <div class="container body-content">
            @RenderBody()
            <hr />
            <footer>
                <p>&copy; 2015 – @AppSettings.Options.SiteTitle</p>
            </footer>
        </div>
        @RenderSection("scripts", required: false)
    </body>
</html>

Once we finish updating the .NET application, we need to republish it in order to update the Docker image in the VM.

clip_image002

When the application is published, you can configure it to use any port you want. However, you need to consider that the image will be parametrized to use that port, so when we instantiate the containers to run the image without overriding the entry point, the application will use the port we configured when we published it.

Configure a Docker Compose to use the .NET web application

In order to start configuring the Compose, we will create a new directory for the required files.

We will call this new directory “netcomposetest” and create it by running:

mkdir netcomposetest

 

Then, we need to change the current directory in order to work in the new one:

cd netcomposetest

 

Once we are working in the proper directory, we will create the “docker-compose.yml” file required to create a Docker Compose. To create the file we just need to run:

> docker-compose.yml

 

Finally, we will use vi to edit the file by running:

vi docker-compose.yml

 

Once the file opens, we press i (“Insert”) to start editing the file.

In this file we need to define the services that will be part of the Docker Compose. In our case, we will configure the .NET application and the Redis service:

net:
  image: websiteondocker
  ports:
   - "210"
  links:
   - redis
redis:
  image: redis

As you can see, we configured a service called “net” to use the image corresponding to our .NET application. We configured it to use the same port we used when we published the app. In this case, we just provided the private port. When Docker Compose creates the container for the web app, it will assign a public port and will map it to private port 210. Additionally, below the ports specification, we defined a link to the Redis service.

Finally, to save the changes we made to the file, we just need to press Esc and then :wq (write and quit).

At this point we should be ready to create the Compose.

In order to later check that the expected containers were created, we will run the “docker ps” command to see what containers we have running before starting the Compose:

clip_image002[6]

As you can see in the previous image, the only container we have running is the one corresponding to the .NET application we published from Visual Studio.

After checking the list of running containers, we can go ahead and run the following command to create the Docker Compose:

docker-compose up -d

 

Upon running this command, the containers should be created.

clip_image001

Immediately after running the command, we ran “docker ps” to validate that the containers were running.

Although the expected containers were listed the first time we ran “docker ps”, when we ran that command again a few seconds later, the container corresponding to the .NET application was not listed anymore. We then ran the “docker ps” command adding the “-a” option to see the stopped containers, and we saw that the status of the .NET app container created by the Compose was “Exited”.

image

To figure out what happened, we ran the logs command for that container and confirmed that it had crashed:

image

While researching this issue, we found a similar issue and a workaround. The proposed workaround consists of modifying the .NET application entry point defined in the Dockerfile.

The original Dockerfile generated by Visual Studio specifies a specific entry point:

FROM microsoft/aspnet:vs-1.0.0-beta4
ADD . /app
WORKDIR /app/approot/src/{{ProjectName}}
ENTRYPOINT ["dnx", ".", "Kestrel", "--server.urls", "http://localhost:{{DockerPublishContainerPort}}"]

The workaround consists of applying a “sleep” before the “dnx” as follows:

FROM microsoft/aspnet:vs-1.0.0-beta4
ADD . /app
WORKDIR /app/approot/src/{{ProjectName}}
ENTRYPOINT sleep 10000000 | dnx . Kestrel --server.urls http://localhost:{{DockerPublishContainerPort}}

After applying this change to the Dockerfile, we republished the application from Visual Studio and ran the “docker-compose up -d” command again in order to regenerate the containers and use the new .NET application image. This time the container corresponding to the .NET app didn’t crash.

In order to validate that the site is running and working as expected, we could create an endpoint in the Azure VM for the .NET application port, but to make it simpler we will start a proxy (tutum/haproxy) in port 80 pointing to the .NET app container. To do this we need to know the container name, so we need to run “docker ps” (we could also run “docker-compose ps”):

clip_image002[8]

Since the name of the .NET application container is “netcomposetest_net_1”, we can start the proxy by running:

docker run -d -p 80:80 --link netcomposetest_net_1:netcomposetest_net_1 tutum/haproxy

 

After running this command, a new container should be created:

clip_image002[10]

Now, the .NET web application should be available in port 80:

clip_image003[4]

As you can see, the displayed machine name is the ID corresponding to the .NET app container.

And if we refresh the page, the hits count should increase.

clip_image004

Since the hits count increased as expected, we know that the application was able to connect to Redis and retrieve/update values.

That’s it! We have our .NET web application running as part of a multi-container application created using Docker Compose.


Creating a Machine Learning Web Service (part 2)

This is the second and last part of the Creating a Machine Learning Web Service post series; you can find the first part here. In this post, you will see how to train your model and evaluate it. You will also create a web service from your trained model and use it to predict credit risks.

Training the model

Once our data is pre-processed, it is time to complete the predictive model by training and testing it. We have to use some data to train the model and some data to test how well the model is able to predict the credit risk of the customers.

There are two main types of supervised machine learning techniques: classification and regression. The former is used to predict discrete values, and the latter is used to predict a continuous set of values.

For our scenario, we want to predict discrete values: High risk & Low risk, which is why we are going to use the Classification technique.

  1. First, we are going to generate separate datasets that will be used for training and testing our model. This is done by using the Split module. Find it in the toolset and drag and drop it in the canvas below the Metadata Editor module.
  2. Connect both modules.
  3. Select the Split module and notice that by default, the Fraction of rows in the first output dataset property is set to 0.5. This means that half the data will be randomly sent to the left port, and the other half will be sent to the right port. Change that value to 0.7 to use 70% of the data to train the model and the remaining 30% to test it. You can run the model at any time to see how the dataset is evolving.
    editing fraction of rows
    running the experiment
  4. If you expand the Machine Learning node in the toolset and then Initialize Model, you will see the machine learning techniques.
    machine learning techniques
  5. Expand the Classification node, locate the Two-Class Boosted Decision Tree model, and drag it to the canvas.
  6. Find the Train Model module and drag and drop it in the canvas, too.
  7. Connect the output of the Two-Class Boosted Decision Tree module to the left input port of the Train Model module, and the left output of the Split module to the right input port of the Train Model module.
  8. Select the Train Model module.
  9. In the Properties pane, click Launch column selector.
    column selector - train model
  10. Select Include in the first dropdown listbox, leave column names in the second, and enter or select “Credit Risk” in the text field. This is the value that our model is going to predict.
    selecting train model column
  11. Run the experiment.
    trained experiment

Score and test the model

To score the trained model we will use the data that was separated by the Split module. Knowing the expected output, we will test the predicted values and compare those results. To do this, follow these steps:

  1. Locate the Score Model module and drag it to the canvas below the Train Model module.
  2. Connect the Train Model module to the left input port of the Score Model module, and the right output port of the Split module to the right input port of the Score Model module. This can be seen in the following screenshot.
    score the model
  3. Run the experiment.
  4. Click the output port of the Score Model module and click View Results. The table will show the known values from Credit risk and the predicted values from the test data.
  5. Lastly, to test the quality of the results, find and drag the Evaluate Model module to the canvas.
  6. Connect the output port of the Score Model module to the left input port of the Evaluate Model module. You can use the Evaluate Model module to compare the performance between two different trained models.
    evaluate the model
  7. Run the experiment.
  8. Click the output port of the Evaluate Model and select View Results. This shows a graphic and metrics that are useful for comparing the results of the scored model.
    evaluate model results
  9. After completing the first machine learning experiment, you can try to improve your model. For example, you can improve your data pre-processing by changing the features (properties) of your dataset. Or you can change the prediction algorithm. You can even use two algorithms and compare the results using the Evaluate Model module.

Publishing the web service

Now, to make the predictive model available to everyone, we have to publish it as a web service in Azure so it will receive customer information and return its credit risk predictions.

To create the web service, we have to convert the training experiment into a scoring experiment and then publish that experiment as a web service.

To convert the training experiment to a scoring experiment, perform the following steps.

  1. Run the experiment. This will enable the Predictive Web Service option.
    creating the predictive experiment
  2. Point to the Set Up Web Service option of the menu, and click Predictive Web Service [Recommended]. This will update the model by removing modules that are no longer needed in the predictive experiment (for example, the training algorithm). The resulting experiment will look like the following.
    scoring experiment
  3. Run the predictive experiment.
  4. Now you can publish the web service created from our experiment. To do so, click Deploy Web Service. Machine Learning Studio will publish the web service and will take you to the service dashboard.
    publishing web service
  5. In the dashboard, click Configure to edit the settings. Here you can choose a better display name for your experiment (Credit Scoring), as well as change the name of the input and output variables.
  6. Save the changes.
  7. Go back to the Dashboard.
  8. Now we will test the web service. Click the Test button in the Default Endpoint section or the Download Excel Workbook link.
  9. If you clicked Test, you can enter values for every field, except the last one (the one we want to predict). In this case, let’s download the workbook.
  10. Enable editing and Macros in the workbook. After a few seconds the fields will be generated.
  11. Enter several values in the input fields to see how the prediction is being updated.
    predicting values

Notice the Scored Labels and the Scored Probabilities are calculated by the model.

You can update your web service at any time. Just modify the training experiment, update the scoring experiment and finally, publish the experiment again. This will replace the web service.

Hope this has helped introduce you to the Machine Learning world.

Thanks!
Diego


Docker Compose: Creating Multi-Container Applications

Introduction

Simply deploying apps to Docker is not an architectural shift that will bring the agility, isolation and automated DevOps capabilities that a microservices approach can offer. You can always deploy a monolithic application into a Docker container.

The microservice architecture is an approach to decomposing a single app into a suite of small, independently deployable services that communicate with each other.

There should be a bare minimum of centralized management of these services.

What is Docker Compose?

Docker Compose is an orchestration tool that makes spinning up multi-container applications effortless.

With Compose, you define a multi-container application in a single file, then spin your application up with a single command that takes care of everything needed to get it running.

 

While Compose is not yet considered production-ready, it is great for dev environments, staging, learning and experimenting.


Creating a Machine Learning Web Service (part 1)

This post is intended to give you a practical introduction, in how-to format, to Azure Machine Learning by showing the usage of Azure Machine Learning Studio and creating a web service that predicts whether or not a customer would make a good loan candidate. This is the first part of two posts related to the topic. For an introduction to Machine Learning you can see this introductory post by Mariano Vazquez.

> Note: Take into account that, as Machine Learning Studio is updated frequently, some steps/images may be slightly outdated.

Azure Machine Learning Studio

Microsoft Azure Machine Learning Studio is a collaborative, drag-and-drop tool you can use to build, test, and deploy predictive analytics solutions with your data, without writing even a single line of code. It uses the same interface as the Azure Management Portal. Machine Learning Studio publishes models as web services that can easily be consumed by custom apps or Business Intelligence tools.

The different sections of the Machine Learning Studio let you organize your resources. They include:

  • Experiments: This section is where you can see your existing predicting/scoring models and manage them. Additionally, sample experiments are also in this section.
  • Web Services: Here you can see the exposed web services of your trained models and manage them.
  • DataSets: In this section you can see your existing and copied DataSets, as well as samples that can be reused in other experiments.
  • Trained Models: This is a list of all the trained models in your workspace.
  • Settings: Here you can see and edit the configuration of your account and workspace.

Creating a predictive model

To use the Machine Learning Studio, first you have to create a workspace. This can be done from the Azure Management portal.

  1. Go to the MACHINE LEARNING section in the left pane, and click it to expand its options.
  2. Click the CREATE AN ML WORKSPACE option.
  3. In the QUICK CREATE form, enter the required information and click CREATE AN ML WORKSPACE.
    create an ml workspace
  4. After the workspace is created, select it from the list and click OPEN IN STUDIO.
    open in studio

Creating an experiment

  1. Create a new experiment by clicking NEW in the lower-left corner of the screen.
  2. Make sure that Experiment is selected in the left pane.
  3. Create a Blank Experiment by choosing the appropriate template in the Microsoft Samples screen. The model editor will open.
    microsoft samples

Getting the Data

You can use your own data or one of the sample datasets to create and train your models. In this example we will use the German Credit Card UCI dataset, which contains customer information along with the level of risk involved in granting each customer a loan.

  1. To add the sample dataset, expand the Saved Datasets and then the Samples node.
  2. Type “credit” in the search box to filter the options.
  3. Drag and drop the German Credit Card UCI dataset to the canvas.
    dataset
  4. You can visualize the data by clicking the output port of the dataset and selecting Visualize.
    visualizing the data
  5. The first 100 rows of the dataset will be displayed. Notice that as this is a headerless dataset, the column names are automatically filled. For machine learning, labeling the columns is not necessary. Click on any column and notice that each column type is automatically detected and a histogram is calculated with the frequency of possible values, along with other information.
    view dataset
  6. Scroll to the right. The last column (Col21) shows whether the customer has low risk (1) or high risk (2). This is the value we aim to predict.
  7. Close the dataset preview.

Pre-processing Data

Pre-processing the data is necessary to correctly analyze it. This could include completing missing values, normalizing data, removing unwanted columns, etc.

  1. To simplify working with the model, we can add meaningful names to the columns using the Metadata Editor module. Use the search box to find the module, and drag and drop it in the canvas.
  2. Connect the modules.
    connect the modules
  3. Select the Metadata Editor module, and in the Properties pane, click the Launch column selector button.
  4. In the dialog box, choose All Columns in the Begin With dropdown list box.
  5. Delete the row below the Begin With dropdown box by clicking the minus sign.
    removing include row
  6. Click the check mark to close the dialog box
    .confirming select column dialog
  7. In the Properties pane, go to the New Column names field, and paste the column names, separating them with commas. The names of the fields can be obtained from the dataset documentation here. For simplicity, you can paste the following list:
    Status of existing checking account, Duration in months, Credit history, Purpose, Credit amount, Savings account/bonds, Present employment since, Installment rate in percentage of disposable income, Personal status and sex, Other debtors/guarantors, Present residence since, Property, Age in years, Other installment plans, Housing, Number of existing credits, Job, Number of people providing maintenance for, Telephone, Foreign worker, Credit risk
  8. To visualize the output of the Metadata Editor Module, you must run the experiment first. To do this, click Run.
    running the experiment
  9. When the experiment finishes running, click the output of the Metadata Editor module and click Visualize. Notice that the columns now have descriptive names.
    viewing metadata editor output

Other modules typically used for preprocessing data are:

Project Columns: Used to create a projection of the dataset. For example, to exclude unwanted columns.

Clean Missing Data: Used to specify how missing values will be handled. In this dataset there are no missing values, and therefore we won’t use this module.

Defining features

Features are measurable properties of the entity you are working with. Considering our dataset, each row represents a bank customer, and each column a feature (property) of that customer.

Most of the time you will use a subset of an entity rather than all the available features. Take into account that some features may be more useful for the predictive model than others. For example, the Credit History feature is very relevant to our scenario. Therefore, if you have lots of features for each entity, it is important to select a good subset of features for predicting the results. You can even try with different subsets to find which one works best.

To create a subset of features, you can use the Project Columns module as mentioned before. In our scenario we will keep the existing set of features.

What’s next

The next post of this series will cover how to train the model, create a scoring model to predict new values, and publish the web service. You can see it here.

See you soon!


Introduction to Azure Machine Learning

Note: If you are already familiar with machine learning you can skip this post and jump directly to the Creating a Machine Learning Web Service post by Diego Poza, which explains how you can use Azure Machine Learning with a specific example.

Machine learning is a science that allows computer systems to independently learn and improve based on past experiences or human input. It might sound like a new technique, but the reality is that some of our most common interactions with our apps and the Internet are driven by automatic suggestions or recommendations, and some companies even make decisions using predictions based on past data and machine learning algorithms.

This technology comes in handy especially when handling Big Data. Today, companies collect and accumulate data at massive, unmanageable rates (website clicks, credit card transactions, GPS trails, social media interactions, etc.), and it’s becoming a challenge to process all the valuable information and use it in a meaningful way. This is where rule-based algorithms fall short: machine learning algorithms use all the collected, “past” data to learn patterns and predict results (insights) that help make better business decisions.

Let’s take a look at these examples of machine learning. You may be familiar with some of them:

  • Online movie recommendations on Netflix, based on several indicators like recently watched titles, ratings, search results, movie similarities, etc. (see here)
  • Spam filtering, which uses text classification techniques to move potentially harmful emails to your Junk folder.
  • Credit scoring, which helps banks decide whether or not to grant loans to customers based on credit history, historical loan applications, customers’ data, etc.
  • Google’s self-driving cars, which use Computer vision, image processing and machine learning algorithms to learn from actual drivers’ behavior.

As seen in the examples above, machine learning is a useful technique to build models from historical (or current) data, in order to forecast future events with an acceptable level of reliability. This general concept is known as Predictive analytics, and to get more accuracy in the analysis you can also combine machine learning with other techniques such as data mining or statistical modeling.

In the next section, we will see how we can use machine learning in the real world, without the need to build a large infrastructure and to avoid reinventing the wheel.

What is Azure Machine Learning?

Azure Machine Learning is a cloud-based predictive analytics service for solving machine learning problems. It provides visual and collaborative tools to create predictive models that can be published as ready-to-consume web services, without worrying about the hardware or the VMs that perform the calculations.

AzureMLService

Azure Machine Learning Studio

You can create predictive analysis models in Azure ML Studio, a collaborative, drag-and-drop tool to manage Experiments, which basically consist of datasets and algorithms to analyze the data, “train” the model and evaluate how well the model is able to predict values. All of this can be done with no programming, because it provides a large library of state of the art Machine Learning algorithms and modules, as well as a gallery of experiments authored by the community and ready-to-consume web services that can be purchased from the Microsoft Azure Marketplace.

Azure Machine Learning Studio

Next steps

  • What is Azure Machine Learning Studio?
    Understand more about the Azure Machine Learning Studio workspace and what you can do with it.
  • Machine learning algorithm cheat sheet
    Investigate some of the state of the art machine learning algorithms to help you choose the right algorithm for your predictive analytics solution. There are three main categories of machine learning algorithms: supervised learning, unsupervised learning, and reinforcement learning. The Azure Machine Learning library contains algorithms of the first two, so it might be worth a look.
  • Azure Machine Learning Studio site
    Get started, read additional documentation and watch webinars about how to create your first experiment in the Azure Machine Learning Studio tool.

Getting started with Docker in Microsoft Azure

Introduction

With Docker, you can package an application with all of its dependencies into a standardized unit for software development. You can use Microsoft Azure to create a Docker VM that can host any number of containers for our applications on Azure. So, the goal of this post is to share the easiest way to create the Docker VM and deploy an ASP.NET 5 website to a Docker Container using Visual Studio Tools for Docker.

We want to highlight the importance of using certificates and how Visual Studio uses them to connect to the server. The tool auto-generates the certificates so that our development environment and our Docker VM in Azure can communicate securely using TLS, which means the daemon will only allow connections from clients authenticated with a signed certificate. If you prefer not to open the Docker daemon up to the world at all, you can instead set up the SSH port and run the client from within the VM. See more about this here.

Also, keep in mind that we will be using Visual Studio 2015 and a Microsoft Azure Subscription. If you would like to see more ways to create a Docker VM in Azure, please see this post.

Well, on to the fun part! Perform the following tasks, and by the end of this post you will have created and deployed a web page to a Docker Container.

Task 1 – Creating the Web Site

In this task, you will create a website using Visual Studio 2015 RC. This website will display the OS version that is currently running.

  1. Open Visual Studio 2015 and create a new ASP.NET Web Application project using the Web Site template of the ASP.NET Preview Templates available.

selecting-web-site-template

  2. Once the project has been created, open the Views/Home/Index.cshtml file, locate the div with the carousel-caption class and replace it with the following snippet.

carousel-caption
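As a rough illustration only (this is not the exact markup from the post), a replacement along these lines would display the operating system the site is running on:

@* Hypothetical replacement for the carousel-caption div: display the current OS. *@
<div class="carousel-caption">
    <h2>This web application is running on: @Environment.OSVersion</h2>
</div>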

  3. Run the application and see the result. You will see that the web application is running on Windows.

website-running-locally

Task 2 – Creating the Docker VM in Azure

In this task, you will create the Docker VM in Azure using the Visual Studio 2015 RC Tools for Docker.

  1. Go to Visual Studio and publish your web site. In the Publish Web dialog box, select Docker Containers as the publish target.

selecting-docker-containers

  2. If you are not logged in, a Visual Studio dialog box will appear. You must sign in using an account with a valid Azure subscription.
  3. Now, in the Select Docker Virtual Machine dialog box, you can select an existing Docker VM or create a new one. In this post, you will create a new Docker VM.

selecting-new-docker-vm

  4. In the Create Virtual Machine dialog box, enter a DNS name, your Username (which will be the Virtual Machine’s administrator) and Password, and click OK.

creating-docker-vm-in-azure

  5. You will now see a dialog box saying that the VM is being created in Microsoft Azure. Click OK.

vm-creation-started-dialog-box

  6. Check the progress of the VM creation in the Output window.

progress-shown-output-window

Task 3 – Deploying a website to the Docker Container

In this task, you will deploy the website to the Docker Container.

  1. Publish your website again. You will now see the connection tab with the information of our Virtual Machine created in the previous task. Notice the Server Url has been populated with the DNS name that you set when creating the Virtual Machine. Finally, click Validate Connection and then Publish.

publishing-to-docker-container

  2. Take a look at the Output window. You will see the activity while your website is deploying.

output-window-when-deploy-started

  3. Finally, you have a .NET website running on a Docker Container for Linux on Azure.

website-running-on-docker

 
