Wednesday, October 16, 2019

Data Analysis with Python - Model Evaluation and Refinement


import pandas as pd
import numpy as np

# Import clean data
path = 'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DA0101EN/module_5_auto.csv'
df = pd.read_csv(path)


df.to_csv('module_5_auto.csv')


First, let's use only the numeric data:

df=df._get_numeric_data()
df.head()


Libraries for plotting

%%capture
! pip install ipywidgets


import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display
from IPython.html import widgets
from ipywidgets import interact, interactive, fixed, interact_manual



Functions for plotting

# Plot two distributions (e.g., actual vs. fitted prices) on the same axes
def DistributionPlot(RedFunction, BlueFunction, RedName, BlueName, Title):
    width = 12
    height = 10
    plt.figure(figsize=(width, height))

    ax1 = sns.distplot(RedFunction, hist=False, color="r", label=RedName)
    ax2 = sns.distplot(BlueFunction, hist=False, color="b", label=BlueName, ax=ax1)

    plt.title(Title)
    plt.xlabel('Price (in dollars)')
    plt.ylabel('Proportion of Cars')

    plt.show()
    plt.close()



# Plot the training points, test points, and the model's predicted function
def PollyPlot(xtrain, xtest, y_train, y_test, lr, poly_transform):
    # xtrain, y_train: training data
    # xtest, y_test: testing data
    # lr: linear regression object
    # poly_transform: polynomial transformation object
    width = 12
    height = 10
    plt.figure(figsize=(width, height))

    xmax=max([xtrain.values.max(), xtest.values.max()])

    xmin=min([xtrain.values.min(), xtest.values.min()])

    x=np.arange(xmin, xmax, 0.1)


    plt.plot(xtrain, y_train, 'ro', label='Training Data')
    plt.plot(xtest, y_test, 'go', label='Test Data')
    plt.plot(x, lr.predict(poly_transform.fit_transform(x.reshape(-1, 1))), label='Predicted Function')
    plt.ylim([-10000, 60000])
    plt.ylabel('Price')
    plt.legend()
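
For illustration, here is a minimal usage sketch of DistributionPlot with made-up stand-ins for actual and predicted prices (the arrays are synthetic, purely to show the call signature):

import numpy as np
actual = np.random.normal(30000, 8000, 200)           # hypothetical actual prices
predicted = actual + np.random.normal(0, 3000, 200)   # hypothetical predictions
DistributionPlot(actual, predicted, "Actual Values", "Predicted Values", "Actual vs Predicted Price")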



Part 1: Training and Testing


y_data = df['price']

x_data = df.drop('price', axis=1)

from sklearn.model_selection import train_test_split


x_train, x_test, y_train, y_test = train_test_split(x_data, y_data, test_size=0.15, random_state=1)


print("number of test samples :", x_test.shape[0])
print("number of training samples:",x_train.shape[0])


number of test samples : 31
number of training samples: 170


x_train1, x_test1, y_train1, y_test1 = train_test_split(x_data, y_data, test_size=0.4, random_state=0)
print("number of test samples :", x_test1.shape[0])
print("number of training samples:",x_train1.shape[0])

number of test samples : 81
number of training samples: 120


Let's import LinearRegression from the module linear_model.
from sklearn.linear_model import LinearRegression


We create a Linear Regression object:
lre=LinearRegression()

We fit the model using the feature 'horsepower':
lre.fit(x_train[['horsepower']], y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
         normalize=False)


Let's calculate the R^2 on the test data:

lre.score(x_test[['horsepower']], y_test)
0.707688374146705

We can see the R^2 is much smaller on the test data. Compare it with the R^2 on the training data:
lre.score(x_train[['horsepower']], y_train)


Find the R^2 on the test data, using 90% of the data for training:

x_train1, x_test1, y_train1, y_test1 = train_test_split(x_data, y_data, test_size=0.1, random_state=0)
lre.fit(x_train1[['horsepower']],y_train1)
lre.score(x_test1[['horsepower']],y_test1)

0.7340722810055448
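
To see how the size of the split affects the score, here is a small sketch (assuming the x_data, y_data, and lre objects defined above) that repeats the experiment for a few test sizes:

from sklearn.model_selection import train_test_split
for ts in [0.10, 0.15, 0.40]:
    xt, xv, yt, yv = train_test_split(x_data, y_data, test_size=ts, random_state=0)
    lre.fit(xt[['horsepower']], yt)
    print("test_size=%.2f  R^2 on test data: %.3f" % (ts, lre.score(xv[['horsepower']], yv)))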



Thursday, October 3, 2019

Polynomial Regression

import matplotlib.pyplot as plt
import pandas as pd
import pylab as pl
import numpy as np
%matplotlib inline

!wget -O FuelConsumption.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv

Understanding the Data
FuelConsumption.csv:
We have downloaded a fuel consumption dataset, FuelConsumption.csv, which contains model-specific fuel consumption ratings and estimated carbon dioxide emissions for new light-duty vehicles for retail sale in Canada.

MODELYEAR e.g. 2014
MAKE e.g. Acura
MODEL e.g. ILX
VEHICLE CLASS e.g. SUV
ENGINE SIZE e.g. 4.7
CYLINDERS e.g. 6
TRANSMISSION e.g. A6
FUEL CONSUMPTION in CITY(L/100 km) e.g. 9.9
FUEL CONSUMPTION in HWY (L/100 km) e.g. 8.9
FUEL CONSUMPTION COMB (L/100 km) e.g. 9.2
CO2 EMISSIONS (g/km) e.g. 182

Reading the data in
df = pd.read_csv("FuelConsumption.csv")

# take a look at the dataset
df.head()
MODELYEAR MAKE MODEL VEHICLECLASS ENGINESIZE CYLINDERS TRANSMISSION FUELTYPE FUELCONSUMPTION_CITY FUELCONSUMPTION_HWY FUELCONSUMPTION_COMB FUELCONSUMPTION_COMB_MPG CO2EMISSIONS
0 2014 ACURA ILX COMPACT 2.0 4 AS5 Z 9.9 6.7 8.5 33 196
1 2014 ACURA ILX COMPACT 2.4 4 M6 Z 11.2 7.7 9.6 29 221
2 2014 ACURA ILX HYBRID COMPACT 1.5 4 AV7 Z 6.0 5.8 5.9 48 136
3 2014 ACURA MDX 4WD SUV - SMALL 3.5 6 AS6 Z 12.7 9.1 11.1 25 255
4 2014 ACURA RDX AWD SUV - SMALL 3.5 6 AS6 Z 12.1 8.7 10.6 27 244
Let's select some features that we want to use for regression.

cdf = df[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB','CO2EMISSIONS']]
cdf.head(9)
ENGINESIZE CYLINDERS FUELCONSUMPTION_COMB CO2EMISSIONS
0 2.0 4 8.5 196
1 2.4 4 9.6 221
2 1.5 4 5.9 136
3 3.5 6 11.1 255
4 3.5 6 10.6 244
5 3.5 6 10.0 230
6 3.5 6 10.1 232
7 3.7 6 11.1 255
8 3.7 6 11.6 267
Let's plot Emission values with respect to Engine size:

plt.scatter(cdf.ENGINESIZE, cdf.CO2EMISSIONS,  color='blue')
plt.xlabel("Engine size")
plt.ylabel("Emission")
plt.show()

Creating train and test dataset
Train/Test Split involves splitting the dataset into training and testing sets that are mutually exclusive. You then train with the training set and test with the testing set.

msk = np.random.rand(len(df)) < 0.8
train = cdf[msk]
test = cdf[~msk]


Polynomial regression


Sometimes the trend of the data is not really linear and looks curved. In this case we can use polynomial regression methods. In fact, many different regressions exist that can be used to fit whatever the dataset looks like, such as quadratic, cubic, and so on, up to arbitrarily high degrees.

In essence, we can call all of these polynomial regression, where the relationship between the independent variable x and the dependent variable y is modeled as an nth-degree polynomial in x. Let's say we want a degree-2 polynomial:

y = b + θ1·x + θ2·x^2

Now the question is: how can we fit our data to this equation when we only have x values, such as Engine Size? Well, we can create a few additional features: 1, x, and x^2.

The PolynomialFeatures() function in the scikit-learn library derives a new feature set from the original feature set. That is, it generates a matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, say the original feature set has only one feature, ENGINESIZE. If we select a polynomial of degree 2, it generates 3 features: degree 0, degree 1, and degree 2:

from sklearn.preprocessing import PolynomialFeatures
from sklearn import linear_model
train_x = np.asanyarray(train[['ENGINESIZE']])
train_y = np.asanyarray(train[['CO2EMISSIONS']])

test_x = np.asanyarray(test[['ENGINESIZE']])
test_y = np.asanyarray(test[['CO2EMISSIONS']])


poly = PolynomialFeatures(degree=2)
train_x_poly = poly.fit_transform(train_x)
train_x_poly
array([[ 1.  ,  2.4 ,  5.76],
       [ 1.  ,  1.5 ,  2.25],
       [ 1.  ,  3.5 , 12.25],
       ...,
       [ 1.  ,  3.2 , 10.24],
       [ 1.  ,  3.  ,  9.  ],
       [ 1.  ,  3.2 , 10.24]])
fit_transform takes our x values and outputs our data raised from the power of 0 to the power of 2 (since we set the degree of our polynomial to 2):

[ v1 ]        [ 1  v1  v1^2 ]
[ v2 ]  -->   [ 1  v2  v2^2 ]
[ ⋮  ]        [ ⋮   ⋮    ⋮  ]
[ vn ]        [ 1  vn  vn^2 ]

in our example

[ 2.0 ]       [ 1  2.0  4.00 ]
[ 2.4 ]  -->  [ 1  2.4  5.76 ]
[ 1.5 ]       [ 1  1.5  2.25 ]
It looks like a feature set for multiple linear regression analysis, right? Yes, it does. Indeed, polynomial regression is a special case of linear regression; the main idea is how you select your features. Just consider replacing x with x1, and x^2 with x2. The degree-2 equation then turns into:

y = b + θ1·x1 + θ2·x2

Now we can deal with it as a 'linear regression' problem. Therefore, this polynomial regression is considered a special case of traditional multiple linear regression, so you can use the same mechanism as linear regression to solve such problems.

So we can use the LinearRegression() function to solve it:

clf = linear_model.LinearRegression()
train_y_ = clf.fit(train_x_poly, train_y)
# The coefficients
print('Coefficients: ', clf.coef_)
print('Intercept: ', clf.intercept_)
Coefficients:  [[ 0.         51.86678532 -1.70694689]]
Intercept:  [105.78768144]
As mentioned before, the coefficients and intercept are the parameters of the fitted curve. Given that it is a typical multiple linear regression with 3 parameters, and knowing that the parameters are the intercept and coefficients of the hyperplane, sklearn has estimated them from our new feature set. Let's plot it:

plt.scatter(train.ENGINESIZE, train.CO2EMISSIONS,  color='blue')
XX = np.arange(0.0, 10.0, 0.1)
yy = clf.intercept_[0]+ clf.coef_[0][1]*XX+ clf.coef_[0][2]*np.power(XX, 2)
plt.plot(XX, yy, '-r' )
plt.xlabel("Engine size")
plt.ylabel("Emission")
Text(0, 0.5, 'Emission')

Evaluation
from sklearn.metrics import r2_score

test_x_poly = poly.fit_transform(test_x)
test_y_ = clf.predict(test_x_poly)

print("Mean absolute error: %.2f" % np.mean(np.absolute(test_y_ - test_y)))
print("Residual sum of squares (MSE): %.2f" % np.mean((test_y_ - test_y) ** 2))
print("R2-score: %.2f" % r2_score(test_y_ , test_y) )
Mean absolute error: 24.77
Residual sum of squares (MSE): 1073.01
R2-score: 0.63
Practice
Try to use a polynomial regression with the dataset but this time with degree three (cubic). Does it result in better accuracy?
# write your code here
poly3 = PolynomialFeatures(degree=3)
train_x_poly3 = poly3.fit_transform(train_x)
clf3 = linear_model.LinearRegression()
train_y3_ = clf3.fit(train_x_poly3, train_y)
# The coefficients
print('Coefficients: ', clf3.coef_)
print('Intercept: ', clf3.intercept_)
plt.scatter(train.ENGINESIZE, train.CO2EMISSIONS,  color='blue')
XX = np.arange(0.0, 10.0, 0.1)
yy = clf3.intercept_[0]+ clf3.coef_[0][1]*XX + clf3.coef_[0][2]*np.power(XX, 2) + clf3.coef_[0][3]*np.power(XX, 3)
plt.plot(XX, yy, '-r' )
plt.xlabel("Engine size")
plt.ylabel("Emission")
test_x_poly3 = poly3.fit_transform(test_x)
test_y3_ = clf3.predict(test_x_poly3)
print("Mean absolute error: %.2f" % np.mean(np.absolute(test_y3_ - test_y)))
print("Residual sum of squares (MSE): %.2f" % np.mean((test_y3_ - test_y) ** 2))
print("R2-score: %.2f" % r2_score(test_y3_ , test_y) )

Coefficients:  [[ 0.         32.67645817  3.58375078 -0.43783753]]
Intercept:  [126.05296936]
Mean absolute error: 24.71
Residual sum of squares (MSE): 1068.14
R2-score: 0.64

Tuesday, October 1, 2019

Multiple Linear Regression

Importing Needed packages
import matplotlib.pyplot as plt
import pandas as pd
import pylab as pl
import numpy as np
%matplotlib inline

Downloading Data
!wget -O FuelConsumption.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv

Understanding the Data
Reading the data in
df = pd.read_csv("FuelConsumption.csv")

# take a look at the dataset
df.head()

Let's select some features to use for regression:
cdf = df[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_CITY','FUELCONSUMPTION_HWY','FUELCONSUMPTION_COMB','CO2EMISSIONS']]
cdf.head(9)

Let's plot Emission values with respect to Engine size:
plt.scatter(cdf.ENGINESIZE, cdf.CO2EMISSIONS,  color='blue')
plt.xlabel("Engine size")
plt.ylabel("Emission")
plt.show()


Creating train and test dataset
msk = np.random.rand(len(df)) < 0.8
train = cdf[msk]
test = cdf[~msk]

Train data distribution
plt.scatter(train.ENGINESIZE, train.CO2EMISSIONS,  color='blue')
plt.xlabel("Engine size")
plt.ylabel("Emission")
plt.show()

Multiple Regression Model
from sklearn import linear_model
regr = linear_model.LinearRegression()
x = np.asanyarray(train[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB']])
y = np.asanyarray(train[['CO2EMISSIONS']])
regr.fit(x, y)
# The coefficients
print('Coefficients: ', regr.coef_)


Ordinary Least Squares (OLS)
OLS is a method for estimating the unknown parameters in a linear regression model. OLS chooses the parameters of a linear function of a set of explanatory variables by minimizing the sum of the squares of the differences between the target dependent variable and those predicted by the linear function. In other words, it tries to minimize the sum of squared errors (SSE) or mean squared error (MSE) between the target variable (y) and our predicted output (ŷ) over all samples in the dataset.
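
For intuition, here is a minimal NumPy sketch of the OLS fit via least squares (the toy data is made up purely for illustration):

import numpy as np
# hypothetical toy data: 5 samples, 2 explanatory variables
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([10.0, 11.0, 20.0, 21.0, 28.0])
Xb = np.c_[np.ones(len(X)), X]                  # prepend a column of 1s for the intercept
theta, *_ = np.linalg.lstsq(Xb, y, rcond=None)  # minimizes ||Xb·theta - y||^2
print("intercept:", theta[0], "coefficients:", theta[1:])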

Prediction
y_hat = regr.predict(test[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB']])
x = np.asanyarray(test[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB']])
y = np.asanyarray(test[['CO2EMISSIONS']])
print("Residual sum of squares: %.2f"
      % np.mean((y_hat - y) ** 2))

# Explained variance score: 1 is perfect prediction
print('Variance score: %.2f' % regr.score(x, y))

Residual sum of squares: 485.04
Variance score: 0.88

Explained variance regression score:
If ŷ is the estimated target output, y the corresponding (correct) target output, and Var the variance (the square of the standard deviation), then the explained variance is estimated as follows:

explainedVariance(y, ŷ) = 1 − Var{y − ŷ} / Var{y}

The best possible score is 1.0; lower values are worse.
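
A minimal sketch computing this by hand and via scikit-learn (the arrays are made-up example values):

import numpy as np
from sklearn.metrics import explained_variance_score
y_true = np.array([200.0, 250.0, 300.0, 180.0])
y_pred = np.array([210.0, 240.0, 310.0, 175.0])
manual = 1 - np.var(y_true - y_pred) / np.var(y_true)
print(manual, explained_variance_score(y_true, y_pred))  # the two values agree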

Simple Linear Regression

import matplotlib.pyplot as plt
import pandas as pd
import pylab as pl
import numpy as np
%matplotlib inline

!wget -O FuelConsumption.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/FuelConsumptionCo2.csv

df = pd.read_csv("FuelConsumption.csv")

# take a look at the dataset
df.head()

# summarize the data
df.describe()

cdf = df[['ENGINESIZE','CYLINDERS','FUELCONSUMPTION_COMB','CO2EMISSIONS']]
cdf.head(9)

viz = cdf[['CYLINDERS','ENGINESIZE','CO2EMISSIONS','FUELCONSUMPTION_COMB']]
viz.hist()
plt.show()

plt.scatter(cdf.FUELCONSUMPTION_COMB, cdf.CO2EMISSIONS,  color='blue')
plt.xlabel("FUELCONSUMPTION_COMB")
plt.ylabel("Emission")
plt.show()

plt.scatter(cdf.ENGINESIZE, cdf.CO2EMISSIONS,  color='blue')
plt.xlabel("Engine size")
plt.ylabel("Emission")
plt.show()

plt.scatter(cdf.CYLINDERS, cdf.CO2EMISSIONS, color='blue')
plt.xlabel("Cylinders")
plt.ylabel("Emission")
plt.show()

msk = np.random.rand(len(df)) < 0.8
train = cdf[msk]
test = cdf[~msk]

plt.scatter(train.ENGINESIZE, train.CO2EMISSIONS,  color='blue')
plt.xlabel("Engine size")
plt.ylabel("Emission")
plt.show()



from sklearn import linear_model
regr = linear_model.LinearRegression()
train_x = np.asanyarray(train[['ENGINESIZE']])
train_y = np.asanyarray(train[['CO2EMISSIONS']])
regr.fit(train_x, train_y)
# The coefficients
print('Coefficients: ', regr.coef_)
print('Intercept: ', regr.intercept_)

Coefficients:  [[39.30843025]]
Intercept:  [125.07479893]

plt.scatter(train.ENGINESIZE, train.CO2EMISSIONS,  color='blue')
plt.plot(train_x, regr.coef_[0][0]*train_x + regr.intercept_[0], '-r')
plt.xlabel("Engine size")
plt.ylabel("Emission")

Text(0, 0.5, 'Emission')

from sklearn.metrics import r2_score

test_x = np.asanyarray(test[['ENGINESIZE']])
test_y = np.asanyarray(test[['CO2EMISSIONS']])
test_y_hat = regr.predict(test_x)

print("Mean absolute error: %.2f" % np.mean(np.absolute(test_y_hat - test_y)))
print("Residual sum of squares (MSE): %.2f" % np.mean((test_y_hat - test_y) ** 2))
print("R2-score: %.2f" % r2_score(test_y_hat , test_y) )

Mean absolute error: 23.61
Residual sum of squares (MSE): 959.87
R2-score: 0.70

Saturday, September 28, 2019

Major Machine Learning Techniques


* Regression/Estimation
-> Predicting continuous values
Algorithms: Linear Regression, Non-Linear Regression, Multiple Linear Regression


* Classification
-> Predicting the item class/category of a case
Algorithms: K-Nearest Neighbours, Decision Trees, Logistic Regression, Support Vector Machine


* Clustering
-> Finding the structure of data; summarization
Algorithms: k-Means Clustering, Hierarchical Clustering, Density-based Clustering

* Associations
-> Associating frequent co-occurring items/events

* Anomaly Detection
-> Discovering abnormal and unusual cases

* Sequence Mining
-> Predicting next events; click-stream (Markov Model, HMM)

* Dimension Reduction
-> Reducing the size of data (PCA)

* Recommendation Systems
-> Recommending items
Algorithms: Content-based Recommendation Engines, Collaborative Filtering


Data Science Methodology


  1. Business Understanding
  2. Analytic Approach
  3. Data Requirements
  4. Data Collection
  5. Data Understanding
  6. Data Preparation
  7. Modelling
  8. Evaluation
  9. Deployment
  10. Feedback

Thursday, September 12, 2019

Docker: Dockerfile

===Dockerfile===

FROM ubuntu:15.04
COPY . /app
RUN make /app
CMD python /app/app.py

Use a .dockerignore file to exclude files from the build context.

===Working with Instructions===

sudo yum install git -y

mkdir docker_images

cd docker_images

mkdir weather-app
cd weather-app

git clone https://github.com/linuxacademy/content-weather-app.git src

vi Dockerfile

FROM node
LABEL org.label-schema.version=v1.1
RUN mkdir -p /var/node
ADD src/ /var/node
WORKDIR /var/node
RUN npm install
EXPOSE 3000
CMD ./bin/www

docker image build -t linuxacademy/weather-app:v1 .

docker image ls

docker container run -d --name weather-app1 -p 8081:3000 linuxacademy/weather-app:v1

docker container ls

curl localhost:8081

===Environment Variables===

cd docker_images
mkdir env
cd env

git clone https://github.com/linuxacademy/content-weather-app.git src

vi Dockerfile

FROM node
LABEL org.label-schema.version=v1.1
ENV NODE_ENV="development"
ENV PORT 3000

RUN mkdir -p /var/node
ADD src/ /var/node/
WORKDIR /var/node
RUN npm install
EXPOSE $PORT
CMD ./bin/www

docker image build -t linuxacademy/weather-app:v2 .

docker image ls

docker image inspect linuxacademy/weather-app:v2

docker container run -d --name weather-dev -p 8082:3001 --env PORT=3001 linuxacademy/weather-app:v2

docker container ls

curl localhost:8082

docker container inspect weather-dev

docker container run -d --name weather-app2 -p 8083:3001 --env PORT=3001 --env NODE_ENV=production linuxacademy/weather-app:v2

docker container inspect weather-app2


curl localhost:8083

docker container logs weather-app2

docker container run -d --name weather-prod -p 8084:3000 --env NODE_ENV=production linuxacademy/weather-app:v2

docker container logs weather-prod

curl localhost:8084

===Build Args===

cd docker_images
mkdir args
cd args

git clone https://github.com/linuxacademy/content-weather-app.git src

vi Dockerfile

FROM node
LABEL org.label-schema.version=v1.1
ARG SRC_DIR=/var/node

RUN mkdir -p $SRC_DIR
ADD src/ $SRC_DIR
WORKDIR $SRC_DIR
RUN npm install
EXPOSE 3000
CMD ./bin/www

docker image build -t linuxacademy/weather-app:v3 --build-arg SRC_DIR=/var/code .

docker image inspect linuxacademy/weather-app:v3 | grep WorkingDir

docker container run -d --name weather-app3 -p 8085:3000 linuxacademy/weather-app:v3

curl localhost:8085


===Working with Non-privileged User===

cd docker_images
mkdir non-privileged-user
cd non-privileged-user

vi Dockerfile

FROM centos:latest
RUN useradd -ms /bin/bash cloud_user
USER cloud_user

docker image build -t centos7/nonroot:v1 .

docker container run -it --name test-build centos7/nonroot:v1 /bin/bash

bash$ sudo su
bash$ su -
bash$ exit

docker container ls

docker container start test-build

docker container exec -u 0 -it test-build /bin/bash

bash$ whoami
bash$ exit

cd ~/docker_images
mkdir node-non-privileged-user
cd node-non-privileged-user

vi Dockerfile

FROM node
LABEL org.label-schema.version=v1.1
RUN useradd -ms /bin/bash node_user
USER node_user
ADD src/ /home/node_user
WORKDIR /home/node_user
RUN npm install
EXPOSE 3000
CMD ./bin/www

git clone https://github.com/linuxacademy/content-weather-app.git src

docker image build -t linuxacademy/weather-app-nonroot:v1 .

docker container run -d --name weather-app-nonroot -p 8086:3000 linuxacademy/weather-app-nonroot:v1

curl localhost:8086

===Order of Execution===

cd docker_images
mkdir centos-conf
cd centos-conf

vi Dockerfile

FROM centos:latest
RUN mkdir -p ~/new-dir1
RUN useradd -ms /bin/bash cloud_user
RUN mkdir -p /etc/myconf
RUN echo "Some config data" >> /etc/myconf/my.conf
USER cloud_user
RUN mkdir -p ~/new-dir2

docker image build -t centos7/myconf:v1 .

===Using the Volume Instruction===

cd docker_images
mkdir volumes
cd volumes

vi Dockerfile

FROM nginx:latest
VOLUME ["/usr/share/nginx/html/"]

docker image build -t linuxacademy/nginx:v1 .

docker container run -d --name nginx-volume linuxacademy/nginx:v1

docker container inspect nginx-volume

docker volume inspect volume-id

sudo ls -la /var/lib/docker/volumes/volume-id/_data

Wednesday, September 11, 2019

Docker: Container Logging



docker container run --name weather-app -d -p 80:3000 linuxacademycontent/weather-app

docker container ls

docker container logs container_id

docker container logs container_id

docker container run -d --name ghost_blog \
-e database__client=mysql \
-e database__connection_host=mysql \
-e database__connection_user=root \
-e database__connection_password=password \
-e database__connection_database=ghost \
-p 8080:2368 \
ghost:1-alpine

docker container ls

docker container ls -a

docker container logs container_id

===Summary===
Create a container using the weather-app image.
docker container run --name weather-app -d -p 80:3000 linuxacademycontent/weather-app
Show information logged by a running container:
docker container logs [NAME]
Show information logged by all containers participating in a service:
docker service logs [SERVICE]
Logs need to be output to STDOUT and STDERR.
Nginx Example:
RUN ln -sf /dev/stdout /var/log/nginx/access.log \
    && ln -sf /dev/stderr /var/log/nginx/error.log
Debug a failed container deploy:
docker container run -d --name ghost_blog \
-e database__client=mysql \
-e database__connection__host=mysql \
-e database__connection__user=root \
-e database__connection__password=P4sSw0rd0! \
-e database__connection__database=ghost \
-p 8080:2368 \
ghost:1-alpine

Docker: Executing Container Commands

docker container run -d nginx

docker container ls

docker container run -it nginx /bin/bash

# nginx -g 'daemon off;'

docker container ls

docker container inspect container_id

curl 172.17.0.3


# exit

docker container ls

docker container ls -a

docker container exec -it container_id ls /usr/share/nginx/html
docker container exec -it container_id /bin/bash

# apt-get update -y
# exit

docker container prune

docker container rm -f container_id

===Summary===
Executing a command:
  • Dockerfile
  • During a Docker run
  • Using the exec command
Commands can be:
  • One and done Commands
  • Long running Commands
Start a container with a command:
docker container run [IMAGE] [CMD]
Execute a command on a container:
docker container exec -it [NAME] [CMD]
Example:
docker container run -d -p 8080:80 nginx
docker container ps
docker container exec -it [NAME] /bin/bash
docker container exec -it [NAME] ls /usr/share/nginx/html/

Docker: Exposing and Publishing Container Ports

===Exposing Container Ports===

docker container run -d nginx

docker container ls

curl localhost

docker container inspect container_id

curl 172.17.0.2

docker container run -d --expose 3000 nginx

docker container ls

docker rm -f container_id

docker container run -d --expose 3000 -p 80:3000 nginx

docker container ls

curl localhost:3000
connection refused

docker container rm -f container_id


docker container run -d --expose 3000 -p 8080:80 nginx

curl localhost:8080

docker container run -d -p 8081:80/tcp -p 8081:80/udp nginx


curl localhost:8081

docker container run -d -P nginx

docker container ls

curl localhost:32768

docker container port container_id

===Summary===

Exposing:
  • Expose a port or a range of ports
  • This does not publish the port
  • Use --expose [PORT]
docker container run --expose 1234 [IMAGE]
Publishing:
  • Maps a container's port to a host's port
  • -p, --publish publishes a container's port(s) to the host
  • -P, --publish-all publishes all exposed ports to random ports
docker container run -p [HOST_PORT]:[CONTAINER_PORT] [IMAGE]
docker container run -p [HOST_PORT]:[CONTAINER_PORT]/tcp -p [HOST_PORT]:[CONTAINER_PORT]/udp [IMAGE]
docker container run -P
Lists all port mappings or a specific mapping for a container:
docker container port [Container_NAME]

Docker - Creating Containers

docker container run --help

docker container run busybox

docker container ls

docker container ls -a

docker container run --rm busybox

docker container ls -a

docker container run nginx

docker container run -d nginx

docker container ls

docker container ls -a

docker container run -it busybox

# ls
# exit

docker container prune -f

docker container run --name my_busybox busybox

docker container ls -a

===Summary===
docker container run:
  • --help Print usage
  • --rm Automatically remove the container when it exits
  • -d, --detach Run container in background and print container ID
  • -i, --interactive Keep STDIN open even if not attached
  • --name string Assign a name to the container
  • -p, --publish list Publish a container's port(s) to the host
  • -t, --tty Allocate a pseudo-TTY
  • -v, --volume list Mount a volume (the bind type of mount)
  • --mount mount Attach a filesystem mount to the container
  • --network string Connect a container to a network (default "default")
Create a container and attach to it:
docker container run -it busybox
Create a container and run it in the background:
docker container run -d nginx
Create a container that you name and run it in the background:
docker container run -d --name myContainer busybox

Docker Commands

docker -h | more

docker image -h

docker image ls -h

docker image ls

docker image pull nginx


docker image ls

docker image inspect image_id

docker container -h

docker container ls

docker container run busybox

docker container ls

docker container ls -a

docker container run -P -d nginx

docker container ps

docker container inspect container_id

curl http://172.17.0.2

docker container inspect container_id

docker container top container_id

docker container ls
docker container attach container_id

docker container ls

docker container ls -a

docker container start container_id

docker container ls

docker container stop container_id
docker container start container_id

docker container logs container_id

docker container ls

curl localhost:32774

docker container logs container_id

docker container ls

docker container stats container_id

docker container exec -it container_id /bin/bash

# ls
# ls /usr/share/nginx/html/
# exit

docker container ls

docker container exec -it container_id ls /usr/share/nginx/html/

docker container ls

docker container pause container_id

docker container ls

docker container unpause container_id


docker container ls -a

docker container rm -f container_id

docker container ls -a

docker container prune

docker container ls -a

docker container prune -h

docker container prune -f

===Summary===
Get a list of all of the Docker commands:
docker -h

Management commands were introduced in Docker Engine v1.13

Management Commands:
  • builder Manage builds
  • config Manage Docker configs
  • container Manage containers
  • engine Manage the docker engine
  • image Manage images
  • network Manage networks
  • node Manage Swarm nodes
  • plugin Manage plugins
  • secret Manage Docker secrets
  • service Manage services
  • stack Manage Docker stacks
  • swarm Manage Swarm
  • system Manage Docker
  • trust Manage trust on Docker images
  • volume Manage volumes
docker image:
  • build Build an image from a Dockerfile
  • history Show the history of an image
  • import Import the contents from a tarball to create a filesystem image
  • inspect Display detailed information on one or more images
  • load Load an image from a tar file or STDIN
  • ls List images
  • prune Remove unused images
  • pull Pull an image or a repository from a registry
  • push Push an image or a repository to a registry
  • rm Remove one or more images
  • save Save one or more images to a tar file (streamed to STDOUT by default)
  • tag Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE
docker container:
  • attach Attach local standard input, output, and error streams to a running container
  • commit Create a new image from a container's changes
  • cp Copy files/folders between a container and the local filesystem
  • create Create a new container
  • diff Inspect changes to files or directories on a container's filesystem
  • exec Run a command in a running container
  • export Export a container's filesystem as a tar archive
  • inspect Display detailed information on one or more containers
  • kill Kill one or more running containers
  • logs Fetch the logs of a container
  • ls List containers
  • pause Pause all processes within one or more containers
  • port List port mappings or a specific mapping for the container
  • prune Remove all stopped containers
  • rename Rename a container
  • restart Restart one or more containers
  • rm Remove one or more containers
  • run Run a command in a new container
  • start Start one or more stopped containers
  • stats Display a live stream of container(s) resource usage statistics
  • stop Stop one or more running containers
  • top Display the running processes of a container
  • unpause Unpause all processes within one or more containers
  • update Update configuration of one or more containers
  • wait Block until one or more containers stop, then print their exit codes

Friday, March 15, 2019

Kubernetes Reference


Minikube

minikube start

minikube stop

minikube delete

minikube docker-env

minikube ip



---
Kubectl

kubectl get all
Pods, ReplicaSets, Deployments and Services

kubectl apply -f <yaml file>

kubectl apply -f .

kubectl describe pod <name of pod>

kubectl exec -it <pod name> <command>

kubectl get <pod | po | service | svc | rs | replicaset | deployment | deploy>

kubectl get po --show-labels

kubectl get po --show-labels -l {name}={value}

kubectl delete po <pod name>

kubectl delete po --all



---
Deployment Management

kubectl rollout status deploy <name of deployment>

kubectl rollout history deploy <name of deployment>

kubectl rollout undo deploy <name of deployment>

Docker Reference


Manage images

docker image pull <image name>

docker image ls

docker image build -t <image name> .

docker image push <image name>

docker image tag <image id> <tag name>



---
Manage Containers

docker container run -p <public port>:<container port> <image name>

docker container ls -a

docker container stop <container id>

docker container start <container id>

docker container rm <container id>

docker container prune

docker container run -it <image name>

docker container run -d <image name>

docker container exec -it <container id> <command>

docker container exec -it <container id> bash

docker container logs -f <container id>

docker container commit -a "author" <container id> <image name>



---
Manage your (local) Virtual Machine

docker-machine ip



---
Manage Networks

docker network ls

docker network create <network name>



---
Manage Volumes

docker volume ls

docker volume prune

docker volume inspect <volume name>

docker volume rm <volume name>



---
Docker Compose

docker-compose up

docker-compose up -d

docker-compose logs -f <service name>

docker-compose down



---
Manage a Swarm

docker swarm init (--advertise-addr <ip address>)

docker service create <args>

docker network create --driver overlay <name>

docker service ls

docker node ls

docker service logs -f <service name>

docker service ps <service name>

docker swarm join-token <worker|manager>



---
Manage Stacks

docker stack ls

docker stack deploy -c <compose file> <stack name>

docker stack rm <stack name>



Thursday, March 14, 2019

ElasticSearch PUT and GET data

--Create an index in Elasticsearch

PUT http://host-1:9200/my_index

{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

--To get information about an index

GET http://host-1:9200/my_index


--Add user to index with id 1

POST http://host-1:9200/my_index/user/1

{
  "name": "Deepak",
  "age": 36,
  "department": "IT",
  "address": {
    "street": "No.123, XYZ street",
    "city": "Singapore",
    "country": "Singapore"
  }
}

--To fetch document with id 1

GET http://host-1:9200/my_index/user/1

--Add user to index with id 2

POST http://host-1:9200/my_index/user/2

{
  "name": "McGiven",
  "age": 30,
  "department": "Finance"
}

--Add user to index with id 3

POST http://host-1:9200/my_index/user/3

{
  "name": "Watson",
  "age": 30,
  "department": "HR",
  "address": {
    "street": "No.123, XYZ United street",
    "city": "Singapore",
    "country": "Singapore"
  }
}

--Search documents by name

GET http://host-1:9200/my_index/user/_search?q=name:watson

--Delete an index

DELETE http://host-1:9200/my_index
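
For convenience, here is a minimal sketch of the same calls from Python using the requests library (it assumes an Elasticsearch node reachable at host-1:9200 as above; note that newer Elasticsearch versions drop the /user document type from these URLs):

import requests

base = "http://host-1:9200/my_index"

# create the index with explicit shard/replica settings
requests.put(base, json={"settings": {"number_of_shards": 3, "number_of_replicas": 1}})

# add a user document with id 1
requests.post(base + "/user/1", json={"name": "Deepak", "age": 36, "department": "IT"})

# fetch it back, then search by name
print(requests.get(base + "/user/1").json())
print(requests.get(base + "/user/_search", params={"q": "name:deepak"}).json())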