MongoDB Basic

MongoDB is a cross-platform, document-oriented database that provides high performance, high availability, and easy scalability. MongoDB works on the concepts of collections and documents.

Database

A database is a physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically hosts multiple databases.

In this respect, a MongoDB database is like a database in any other system.

Collection

A collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A collection exists within a single database. Collections do not enforce a schema, so documents within a collection can have different fields.

Document

A document is a set of key-value pairs. Documents have a dynamic schema, meaning that documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data.


In MongoDB, the primary key is the default _id field, which MongoDB generates automatically if you do not supply one.

For example:

{
   _id: ObjectId("7df78ad8902c"),
   title: 'MongoDB Basic',
   description: 'MongoDB is no sql database',
   tags: ['mongodb', 'database', 'NoSQL']
}
Here _id is a 12-byte ObjectId (normally rendered as 24 hex characters; abbreviated above) that guarantees the document's uniqueness within the collection.

Advantages of MongoDB over RDBMS

  • Schemaless (see the sketch after this list).
  • The structure of a single object is clear.
  • No complex joins.
  • Deep query-ability.
  • Easier performance tuning.
  • MongoDB is easy to scale.
  • Keeps the working set in internal memory, enabling faster access to data.
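
To make the schemaless and no-joins points concrete, here is a minimal pymongo sketch. It assumes a local mongod on the default port; the blog database and posts collection names are made up for illustration:

from pymongo import MongoClient

# Connect to a local MongoDB instance (assumed to be running on the default port).
client = MongoClient('mongodb://localhost:27017/')
posts = client['blog']['posts']   # hypothetical database and collection names

# Two documents with different fields can live in the same collection;
# no schema has to be declared up front, and no join table is needed
# because related values (the tags) are embedded in the document itself.
posts.insert_one({
    'title': 'MongoDB Basic',
    'description': 'MongoDB is no sql database',
    'tags': ['mongodb', 'database', 'NoSQL'],
})
posts.insert_one({'title': 'Scaling notes', 'likes': 10})  # different fields

# MongoDB added an _id primary key to both documents automatically.
for doc in posts.find():
    print(doc['_id'], doc['title'])
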
Why use MongoDB?

Here are some features that help explain when to use it and why:

  • Document-oriented storage − data is stored in the form of JSON-style documents.
  • Index on any attribute (see the sketch after this list)
  • Replication and high availability
  • Auto-sharding
  • Rich queries
  • Fast in-place updates
  • Professional support by MongoDB
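
For example, the "index on any attribute" point can be sketched with pymongo (same assumed local setup and hypothetical collection as in the earlier sketch):

from pymongo import MongoClient, ASCENDING

posts = MongoClient('mongodb://localhost:27017/')['blog']['posts']

# A secondary index can be created on any attribute, including array fields
# such as tags; queries filtering on that attribute can then use the index.
posts.create_index([('tags', ASCENDING)])

for doc in posts.find({'tags': 'NoSQL'}):
    print(doc['title'])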

Posted March 10, 2018 by Izharuddin Shaikh in MongoDB


Database Server Scaling Strategies

One of the difficult tasks for organizations nowadays is maintaining their huge volumes of data efficiently. The main concerns are ensuring that no information is lost and that the system is up and running at all times.

There are two scaling techniques available that help in achieving this goal:

1) Horizontal Scaling
2) Vertical Scaling

Horizontal Scaling: achieved by distributing the database across multiple servers, for example by replicating it.
Vertical Scaling: achieved by adding more hardware (CPU, RAM, storage) to a single server.

Companies with huge data volumes and heavy transaction loads face issues such as the increased load of a growing user base, scalable data storage, and high availability.

Factors which help in managing the data store are:
Transactions
Relational data
Fixed schemas
Immediate vs. Eventual Consistency
Data Locking

Keeping the above factors in mind, and what best suits your requirement, one of the replication approaches mentioned below has to be chosen:

Master-Slave Replication

Tree Replication

Master-Master Replication

Ring of Masters

Each of the above has advantages and disadvantages over the others, so select the approach that best suits the requirement (a master-slave read/write split is sketched below).
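
As a rough illustration of what master-slave replication implies for application code, here is a Python sketch of read/write splitting; the host names and the write-detection rule are hypothetical, not part of any particular driver:

import random

# Hypothetical server addresses; in practice these would be connection
# handles from your database driver, one per server.
MASTER = 'db-master:5432'
REPLICAS = ['db-replica-1:5432', 'db-replica-2:5432']

def route(query: str) -> str:
    """Send writes to the master; balance reads across replicas.

    Because replication is typically asynchronous, a read routed to a
    replica may lag behind the master (the immediate vs. eventual
    consistency factor listed above).
    """
    is_write = query.lstrip().upper().startswith(('INSERT', 'UPDATE', 'DELETE'))
    return MASTER if is_write else random.choice(REPLICAS)

print(route('INSERT INTO users VALUES (1)'))   # -> db-master:5432
print(route('SELECT * FROM users'))            # -> one of the replicas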

Other things that need to be kept in mind are:

Replication lag and DB partitioning/clustering

Federation/sharding, and how cross-shard joins will be handled


The above information is at a very high level; you will have to go into the details of each factor before taking the final decision for your requirement.

Posted January 27, 2018 by Izharuddin Shaikh in General


Single sign-on flow

It is a four-step process:

1) When the browser tries to access a protected application, the interceptor redirects it to the SSO server.
2) The browser submits its previously authenticated token (a cookie) to the SSO server.
3) The SSO server validates the token and forwards the browser back to the interceptor with an access token and other user-related information.
4) Based on the user-related attributes, the interceptor allows the browser to access the application.
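
Here is a minimal Python sketch of those four steps; the endpoint URL, token store, and interceptor shape are all hypothetical stand-ins, not a specific SSO product's API:

# Hypothetical stand-ins for the parties in the flow.
SSO_SERVER_URL = 'https://sso.example.com/auth'      # assumed endpoint
VALID_TOKENS = {'cookie-abc123': {'user': 'alice', 'role': 'admin'}}

def sso_validate(token):
    """Steps 2-3: the SSO server checks the previously issued token and
    returns user-related attributes, or None if the token is invalid."""
    return VALID_TOKENS.get(token)

def interceptor(request):
    """Steps 1 and 4: redirect unauthenticated browsers to the SSO server,
    and authorize the rest based on the attributes the server returned."""
    token = request.get('cookie')
    if token is None:
        return {'redirect': SSO_SERVER_URL}          # step 1: bounce to SSO
    user_info = sso_validate(token)                  # steps 2-3
    if user_info is None:
        return {'redirect': SSO_SERVER_URL}
    return {'status': 200, 'user': user_info}        # step 4: allow access

print(interceptor({'cookie': None}))                 # -> redirect to SSO
print(interceptor({'cookie': 'cookie-abc123'}))      # -> access granted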

Posted November 13, 2017 by Izharuddin Shaikh in General


Factors for Capacity Planning

We need to consider the below-mentioned factors while doing hardware capacity planning. These factors help determine what hardware resources should be available in order to get the best performance.

1) Get the number of users expected in a day.
2) Get the peak hour(s) and the percentage of the total number of users during that period.
3) Get the average user visit time.
4) Get the number of page visits during an average visit.
5) Find out the static and dynamic resources on key pages.
6) Ask for the headroom factor.

Based on the above inputs, calculate the total number of concurrent users, the throughput in pages (page rate), and the throughput in requests.
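
As a worked example of that calculation (all input figures below are invented for illustration), the concurrency number follows from the arrival rate multiplied by the average visit time (Little's law):

# Hypothetical inputs gathered from steps 1-6 above.
users_per_day = 100_000
peak_hour_share = 0.20       # 20% of daily users arrive in the peak hour
avg_visit_time_s = 300       # 5-minute average visit (step 3)
pages_per_visit = 10         # step 4
requests_per_page = 15       # static + dynamic resources (step 5)
headroom_factor = 2.0        # safety margin (step 6)

peak_users_per_sec = users_per_day * peak_hour_share / 3600

# Little's law: concurrent users = arrival rate x average time in system.
concurrent_users = peak_users_per_sec * avg_visit_time_s
page_rate = peak_users_per_sec * pages_per_visit        # pages/sec
request_rate = page_rate * requests_per_page            # requests/sec

print(f'concurrent users : {concurrent_users:.0f}')
print(f'page throughput  : {page_rate:.1f} pages/sec')
print(f'request rate     : {request_rate * headroom_factor:.0f} req/sec (with headroom)')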

Once you are done with the calculation, take a server with a good configuration, run tests against it, and find out what that server delivers in terms of x number of requests.

Based on the output from that server, increase the number of servers and recalculate the output. Once you reach a level where your requirement is fulfilled, finalize the configuration for capacity.

Posted November 5, 2017 by Izharuddin Shaikh in General


Marshaling

In computer science, marshaling is the process of transforming the memory representation of an object into a data format suitable for storage or transmission. It is typically used when data must be moved between different parts of a computer program, or from one program to another. Marshaling is similar to serialization and is used when communicating with remote objects, where an object must be converted into a transmittable form.
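
A small Python sketch of the idea, using the standard library's pickle module to marshal an in-memory object into a byte stream and back (this only illustrates the concept; RPC frameworks do the equivalent under the hood):

import pickle

# An in-memory object we want to move to another process or machine.
record = {'title': 'Marshaling', 'tags': ['serialization', 'RPC'], 'views': 42}

# Marshal: transform the memory representation into a byte stream
# suitable for storage or transmission.
wire_bytes = pickle.dumps(record)

# ...the bytes travel over a socket, a file, a message queue...

# Unmarshal on the receiving side: rebuild an equivalent object.
restored = pickle.loads(wire_bytes)
assert restored == record
print(restored)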

Posted September 10, 2017 by Izharuddin Shaikh in General


Terminology in SQL Part 2

Extent – the basic unit in which space is managed. An extent is eight physically contiguous pages, or 64 KB. This means SQL Server databases have 16 extents per megabyte.

Uniform extents – owned by a single object; all eight pages in the extent can only be used by the owning object.

Mixed extents – shared by up to eight objects. Each of the eight pages in the extent can be owned by a different object.

Allocation Unit – a set of particular types of pages.

Partition – is a unit of data organization.

Heap – a table without a clustered index.

IAM – Index Allocation Map; the page that keeps track of all the pages allocated to a heap (there can be more than one IAM page).

B+ Trees – B-tree stands for “balanced tree,” and SQL Server uses a special kind called B+ trees (pronounced “b-plus trees”) that are not kept strictly balanced in all ways at all times. Unlike a normal tree, B-trees are always inverted, with their root (a single page) at the top and their leaf level at the bottom.

Root Node – The top node of the B+ tree is called the root node.

SGAM – Shared Global Allocation Map; tracks mixed extents that still have at least one free page.

GAM – Global Allocation Map; tracks which extents have been allocated.

Posted September 9, 2017 by Izharuddin Shaikh in SQL


Terminology in SQL Part 1

Page – The fundamental unit of data storage in SQL Server

DBCC PAGE – allows you to examine the contents of data and index pages.

DBCC IND – lists of all database pages that make up the selected index or partition.

Data Page – stores all data except text, ntext, image, nvarchar(max), varchar(max), varbinary(max), and xml data, when text in row is set to ON.

Log Files – a series of log records.

Extents – collections of eight physically contiguous pages, used to manage pages efficiently.

.mdf – Primary data file.

.ndf – Secondary data file.

Posted September 9, 2017 by Izharuddin Shaikh in SQL, Uncategorized

