Monday, 28 November 2022

GCP Service - Storage

Google Cloud provides three main storage services, one for each type of storage:

  • Persistent Disks: Block storage
A persistent disk can be thought of as similar to a USB drive: it can be attached to and detached from a system as and when required. It is available in HDD and SSD variants, depending on the use case and requirement. It is called block storage because data is saved in blocks and is not located through a single path as in file storage. Each block is self-contained and therefore easy to manage, but block storage is comparatively costly.

  • Filestore: Network file storage
File storage provides disk storage over the network. It resembles keeping papers in folders and pulling out a particular sheet when it is needed. Files are organized in a similar hierarchy, so to retrieve a file its path must be known.
  • Cloud Storage: Object storage
Object storage is flat storage: data is stored as objects, and all objects sit at the same level rather than in a directory hierarchy, with the underlying data spread across hardware. Each object carries its data together with metadata, that is, information about the object that is used for processing and retrieval.
 
Google Cloud Storage has different storage classes:
  • Standard: The standard storage class is used for data that is accessed frequently.
  • Nearline: The nearline storage class is used when data is needed roughly once every 30 days or less often.
  • Coldline: The coldline storage class is used when data is needed roughly once every 90 days or less often.
  • Archive: The archive storage class is used when data is needed less than once a year. This class is used for data archiving, online backup, and disaster recovery.
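The rule of thumb behind the storage classes can be sketched as a small helper. This is only an illustration: the function name `suggest_storage_class` and the exact cut-offs (30, 90 and 365 days) are assumptions taken from the access frequencies listed above, not an official Google API. The returned strings match the class names Cloud Storage uses (STANDARD, NEARLINE, COLDLINE, ARCHIVE).

```python
def suggest_storage_class(days_between_accesses: float) -> str:
    """Suggest a Cloud Storage class from how often the data is read.

    Hypothetical helper: the thresholds follow the rule of thumb in this
    post and ignore pricing details such as retrieval fees.
    """
    if days_between_accesses < 30:
        return "STANDARD"   # frequently accessed data
    if days_between_accesses < 90:
        return "NEARLINE"   # needed about once a month
    if days_between_accesses < 365:
        return "COLDLINE"   # needed about once a quarter
    return "ARCHIVE"        # needed once a year or less


# Print the suggestion for a few example access intervals (in days).
for days in (1, 30, 90, 365):
    print(days, suggest_storage_class(days))
```

In practice the choice also depends on cost, so treat the output as a starting point rather than a decision.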

Wednesday, 19 October 2022

GCP Service - Compute

Google's compute services let businesses run their workloads on GCP.

The key compute offerings from GCP are:

  • Compute Engine
Compute Engine is an Infrastructure as a Service (IaaS) offering. It delivers configurable, high-performance VMs that run on Google infrastructure.
  • App Engine
App Engine is a Platform as a Service (PaaS) offering. It is a platform for building scalable web apps and mobile back ends.
  • Kubernetes Engine
Kubernetes Engine (GKE) is a Container as a Service offering. GKE provides a managed environment for deploying, managing and scaling containerized applications on Google infrastructure.
  • Cloud Run
Cloud Run is a managed compute platform that lets you run containers that are invoked by web requests or events.
  • Cloud Functions
Cloud Functions is a Function as a Service (FaaS) offering. Cloud Functions are lightweight, event-driven, serverless functions that are generally used for single-purpose tasks.

Google Cloud Platform (GCP)

Google Cloud Platform is a suite of cloud computing services offered by Google. It runs on the same infrastructure that Google uses for its own services.

GCP's main competitors are AWS (Amazon) and Azure (Microsoft).

GCP provides services for compute, storage, databases, data analytics, AI and machine learning, networking, developer tools, etc.

Some of the services from GCP are:

Compute - Create and run customizable virtual machines, containers and functions

  • Compute Engine
  • App Engine
  • Kubernetes Engine
  • Cloud Functions
  • Cloud Run

Storage

  • Cloud Storage
  • Cloud Filestore
  • Persistent Disks

Databases

  • Cloud Bigtable
  • Cloud Firestore
  • Cloud SQL
  • Cloud Spanner
  • Memorystore

Networking

  • Cloud Virtual Network
  • Cloud Load Balancing
  • Cloud CDN
  • Cloud Interconnect
  • Cloud DNS


Wednesday, 28 September 2022

Optimizers in Machine Learning

An optimizer is an algorithm or function that adjusts a model's parameters, such as its weights, and in adaptive methods also the effective learning rate, to improve the model's performance by reducing the loss and improving the accuracy.

Different optimizers

  1. Gradient Descent
  2. Stochastic Gradient Descent
  3. Stochastic Gradient Descent with Momentum
  4. Mini Batch Gradient Descent
  5. Adagrad
  6. RMSProp
  7. AdaDelta
  8. Adam
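As a minimal sketch of the first optimizer in the list, plain gradient descent repeatedly moves a parameter against the gradient of the loss. The one-parameter loss (w - 3)^2, the function names, the learning rate and the step count below are all illustrative assumptions, chosen only because the minimum (w = 3) is easy to verify.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Plain gradient descent on a single parameter.

    grad: function returning the gradient of the loss at w.
    """
    w = w0
    for _ in range(steps):
        # Move against the gradient, scaled by the learning rate.
        w -= learning_rate * grad(w)
    return w


def loss(w):
    # Toy loss with its minimum at w = 3 (an assumption for illustration).
    return (w - 3.0) ** 2


def loss_grad(w):
    # Derivative of (w - 3)^2 with respect to w.
    return 2.0 * (w - 3.0)


w_final = gradient_descent(loss_grad, w0=0.0)
print(w_final, loss(w_final))
```

The other optimizers in the list differ mainly in how much data is used to estimate the gradient and in how the step size is adapted over time.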


Activation Functions

An activation function defines the output of a node based on its input.

Types of Activation Functions

1) Step Function

If x >=0, y=1

If x<0, y=0

2) Sigmoid Function 

The output of this function lies in the interval (0, 1). It is defined as

S(x) =  1/(1+e^(-x))

3) ReLU Function

The Rectified Linear Unit outputs the input directly if the input is positive; otherwise the output is zero.

y = max(0,x)

4) Leaky ReLU

The Leaky Rectified Linear Unit gives negative inputs a small slope a (for example 0.01) instead of a flat slope.

y = ax if x < 0

y = x if x >= 0

5) ELU

Exponential Linear Unit

y = x if x > 0

y = a(e^x - 1) if x <= 0

If x is less than 0, the output is a small negative value that never drops below -a.
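The five activation functions above can be written in a few lines of Python. This is a sketch for scalar inputs only; the default values a = 0.01 for Leaky ReLU and a = 1.0 for ELU are common choices assumed here, not values given in the post.

```python
import math


def step(x):
    # 1 for non-negative input, 0 otherwise.
    return 1.0 if x >= 0 else 0.0


def sigmoid(x):
    # Squashes any real input into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    # Passes positive input through, clips negative input to 0.
    return max(0.0, x)


def leaky_relu(x, a=0.01):
    # Like ReLU, but negative input keeps a small slope a.
    return x if x >= 0 else a * x


def elu(x, a=1.0):
    # Smooth exponential curve for non-positive input, bounded below by -a.
    return x if x > 0 else a * (math.exp(x) - 1.0)


# Evaluate every function at a negative, zero and positive input.
for f in (step, sigmoid, relu, leaky_relu, elu):
    print(f.__name__, [f(x) for x in (-2.0, 0.0, 2.0)])
```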



Monday, 26 September 2022

Artificial Neural Network

Artificial Neural Networks (ANNs) are inspired by the human brain. An ANN is made up of three kinds of layers: an input layer, one or more hidden layers, and an output layer. These networks learn from training data and gradually improve their accuracy. Once trained, they become powerful tools for recognizing patterns.

Some important terminology in ANNs:

  • Weights
  • Bias
  • Learning Rate
  • Threshold
How does a neural network work?

Once input is fed through the input layer, weights are assigned to the inputs. These weights express the importance of each variable: the higher the weight, the greater the importance.
After the weights are assigned, each input is multiplied by its weight and the products are summed together with the bias. The activation function then determines the node's output from this sum; for example, the node fires if the sum crosses a given threshold.

The input layer's output is routed through the hidden layer(s) and then to the output layer. A network that passes data in this single direction is known as a feedforward neural network.
A loss is calculated from the network's output and the actual value. The objective is to minimize this loss.
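The forward pass described above can be sketched for a single node. The sigmoid activation, the squared-error loss and the example numbers are assumptions picked for illustration; the post does not prescribe a particular activation or loss.

```python
import math


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def squared_error(predicted, actual):
    # Loss for one example: the squared difference from the target.
    return (predicted - actual) ** 2


# Illustrative values: two inputs, two weights, no bias.
inputs = [1.0, 2.0]
weights = [0.5, -0.25]
bias = 0.0

prediction = neuron_output(inputs, weights, bias)
print(prediction, squared_error(prediction, actual=1.0))
```

Training consists of adjusting the weights and bias so that this loss shrinks over the training data.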


Friday, 25 March 2022

Hugging Face Introduction

Hugging Face provides libraries for Natural Language Processing (NLP) using transformers. Hugging Face can be used for the following tasks:

  • Sentiment Analysis - Labels a text as positive or negative and provides a score for the label
  • Zero Shot Classification - Classifies a text into the given candidate topics by assigning a score to each topic
  • Text Generation - Generates a continuation of the short text passed in
  • Mask Filling - If a word is hidden in a string, this task predicts the missing word
  • Named Entity Recognition - Classifies entities into predefined categories such as organizations, locations, quantities, etc.
  • Question Answering - Answers questions based on the context passed to the pipeline
  • Summarization - Condenses a long text into a short summary
  • Translation - Translates text from one language to another
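As a rough sketch of how one of these tasks is used, the `transformers` library exposes a `pipeline` function that takes a task name such as "sentiment-analysis". The helper names `analyze_sentiment` and `format_prediction` below are hypothetical wrappers written for this example. The sketch assumes `transformers` and a backend such as PyTorch are installed, and the first call downloads a default model.

```python
def format_prediction(prediction):
    """Render one pipeline result, a dict with 'label' and 'score' keys."""
    return f"{prediction['label']} ({prediction['score']:.2f})"


def analyze_sentiment(texts):
    """Run the sentiment-analysis pipeline on a list of strings."""
    # Imported lazily so that this module loads even without the library.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    return classifier(texts)


# Example usage (needs the library installed and network access):
#     for prediction in analyze_sentiment(["I love this library."]):
#         print(format_prediction(prediction))
```

The other tasks in the list follow the same pattern with a different task name passed to `pipeline`.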

 

Monday, 14 March 2022

Python libraries for Natural Language Processing

Python has multiple libraries for handling text data. Some of the prominent ones are:

1) Natural Language Toolkit (NLTK)
One of the most commonly used libraries for tokenization, lemmatization, stemming, parsing, chunking and POS tagging.

2) Gensim
This library is mainly used for topic modelling, document indexing and similarity retrieval with large corpora. Algorithms in Gensim are memory-independent with respect to corpus size, so the whole corpus does not need to fit in RAM at once.

3) spaCy
An open source NLP library that is mainly used in production for huge volumes of data. It supports tokenization for over 49 languages.