Wednesday, July 15

12) Recap - What is Big Data?

Quick & short recap on 'What is Big Data?'

Today we have computers in our pockets, in cash registers, in cars, in TVs, in our credit cards and everywhere else. Data is being generated by these computers (aka devices), and it is up to companies to decide whether they want to tap this data. Hardware costs have gone down considerably, and companies have realized that they can afford to store more of their data and that analyzing this data will give them a 'more detailed insight into what they are, what their customers want, how they can improve themselves, how to retain existing customers and win new ones, and how to save operations cost at the same time'.

According to one research estimate, the Health Care industry can save up to 300 billion dollars. These are just numbers, but the fact is that the Health Care industry is an early starter, and its companies have used Big Data to optimize their processes and deliver better, faster care. There are transport companies that are using Big Data to optimize the trains running on their lines, by optimizing the routes and the engines, and this is helping companies save billions of dollars in fuel cost.

The change has started, and companies are adopting Big Data based on the opportunities they see. The fact is that with Big Data related technologies it is possible to store large amounts of data, process it in 1/10th of the time and derive value from it. Companies that adapt to the change over the next 10 years will survive the competition. Just collecting data is not enough: unless the data is analyzed, companies will not get value from their Big Data.





11) A Big Data Use Case - Retail Industry


Let's take the example of an online retail company to understand how it can use today's software technologies to leverage Big Data.

Challenges faced by the online retail company:
• The need to examine massive amounts of unstructured social media and search data to find out which 'products consumers are talking about'
• Growing data volumes causing a major storage problem, leading to data regret on a regular basis
• The need to plan the ad buying strategy on sites like Google, with the goal of competing for e-commerce sales
• The need to track products, sales, and customers (petabytes of data) to win pricing concessions from suppliers

Solution
• Primary basis of solution is co-location of storage and compute layer
• Solution proposes using 'Hadoop' for efficient data transformation
• Solution proposes large proportion of analysis to be performed by Hadoop, MapReduce

Result
• Overnight processing of data now completes in minutes each day, enabling faster and improved search results
• Data volumes are reduced by more than 60%
• Faster analytics that quickly react to changing customer sentiments & market conditions

Technologies in the solution
• Hadoop, MapReduce, Hive, Pig, Flume, Pentaho, Java
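
To make the MapReduce part of the solution more concrete, here is a minimal sketch in plain Python (not Hadoop) of the map, shuffle and reduce steps, counting how often products are mentioned in social media posts. The product list, the sample posts and all function names are assumptions made up for illustration; a real job would run the same steps in parallel across the cluster.

```python
from collections import defaultdict

# Hypothetical product catalogue, for illustration only.
PRODUCTS = {"phone", "laptop", "camera"}

def map_phase(post):
    """Emit a (product, 1) pair for every product word found in one post."""
    for word in post.lower().split():
        token = word.strip(".,!?")
        if token in PRODUCTS:
            yield token, 1

def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

def count_mentions(posts):
    pairs = (pair for post in posts for pair in map_phase(post))
    return reduce_phase(shuffle(pairs))

# Hypothetical sample posts.
posts = [
    "Loving my new phone!",
    "This laptop is fast, but the phone is better.",
    "Need a camera for the trip",
]
print(count_mentions(posts))
```

The map and reduce functions never share state, which is what allows a framework like Hadoop to spread them over many machines sitting next to the stored data.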

Monday, July 13

10) Value of Big Data - All that is old is not Gold & all that is Big Data is not Valuable

The raw data from various channels like sensors, instruments and social media usually has to be processed to filter out the 'noise or junk data that has no business value' before it can be consumed by business systems.

When Big Data is created by a system (a device, an enterprise application or an external source like Facebook), it can either be consumed directly (Ex- for Complex Event Processing) or it has to be pre-processed before being stored in a database. This moving data is called 'Data in Motion'.

When data is finally stored in a database or a warehouse it is called 'Data at Rest'. Different benefits can be extracted from 'Data in Motion' and 'Data at Rest', and the chart below explains the typical steps followed in Big Data processing before Big Data is leveraged for some business outcome. To perform analytics on the data it has to be pre-processed and stored in a database; analytics cannot be performed on raw Big Data.
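
The difference between 'Data in Motion' and 'Data at Rest' can be sketched in a few lines of Python. This is only an illustration: the sensor readings, the noise rule (a reading without a numeric value has no business value) and the table layout are all assumptions, and an in-memory SQLite database stands in for the real warehouse.

```python
import sqlite3

def is_noise(reading):
    """Assumed rule: a reading without a numeric value is junk."""
    return not isinstance(reading.get("value"), (int, float))

def process_stream(readings, conn):
    """Filter readings while they are in motion, then store what is left."""
    kept = 0
    for reading in readings:  # data in motion
        if is_noise(reading):
            continue
        conn.execute(
            "INSERT INTO readings (sensor, value) VALUES (?, ?)",
            (reading["sensor"], reading["value"]),
        )
        kept += 1
    conn.commit()
    return kept

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")

# Hypothetical stream of sensor readings.
stream = [
    {"sensor": "s1", "value": 21.5},
    {"sensor": "s2", "value": None},  # noise
    {"sensor": "s1", "value": 22.0},
    {"sensor": "s3"},                 # noise
]
stored = process_stream(stream, conn)

# Analytics on data at rest.
avg = conn.execute(
    "SELECT AVG(value) FROM readings WHERE sensor = 's1'"
).fetchone()[0]
print(stored, avg)
```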


Monday, July 6

9) How are Digital Technologies changing the way enterprises (& governments) work?

Digital technologies are changing the way businesses work and also the way employees do their work. I will try to explain the value of Digital Technologies and how they help build Real Time Enterprise Systems that help companies & governments to continuously improve their business processes and keep a competitive edge. In the image below I have explained the high level steps to build 'Real Time Enterprise Systems' that leverage standard enterprise data as well as Big Data (read the outer circle 1st & then the inner circles).

1) Traditional & Big Data Sources : For any enterprise, data is being created by sensor devices, instruments, emails and social media, and the data comes in text, audio and video formats. A lot of the data is 'noise', and it needs to be cleansed and filtered using smart algorithms before it can be consumed as 'information'.

2) Integrated Data & Systems : An enterprise has many systems & applications, and some of them sit in silos. An integration bus is essential to integrate the systems and ensure a smart & secure flow of information across the enterprise.

3) Processed Data : Enterprise runs by taking informed decisions by applying business rules & Business processes to the business data. Data processing & data enrichment is an essential part of a smart enterprise that can take 'Real Time' decisions. Data Processing also involves removing 'noise' from the Big Data that is being created by various sources.


4) Integrated Business Process : BPM (Business Process Management), Rules Engines, Portals and Mobile applications are the technologies that help implement integrated BPM in the enterprise and help build a smart enterprise that takes informed decisions in Real Time as business events occur in any department in any part of the world. Another advantage of BPM is that a large number of business decisions can be automated (& work 24/7), and thus business processing can be accelerated. End to end integration ensures that any 'notable event' across the enterprise is monitored & automated decisions are taken in real time.

5) Real Time Enterprise taking Intelligent Business Decisions: The most competitive enterprises have to be Real Time Enterprises. A real time enterprise is a smart enterprise that can take decisions in real time based on the information made available to the business. A real time enterprise uses software to automate a large number of business processes and so reduces manual work and human intervention. The circle in the center of the image above is 'Intelligent Business Decisions', which is the value gained by implementing the 4 steps in the outer circles of the image. I hope this quick overview was helpful. Do message me if you have any questions.

Wednesday, July 1

8) Making a success of Digital India Program kicked off by PM on 1st July 2015

Digital India Program is a new project by the Indian government to extend e-governance to the 'Gram Panchayat', thus connecting the central government to the basic governance body in the smallest entity of our population, which is 'The Village' ('Gram' is the Hindi word for 'Village').

Is Digital India a Revolutionary Step? Yes, it is. This is the first time the Indian government has committed to connect the 'basic governance body of the country to the central governance system' and allocated funds for the program. It is a big thing. This kick-starts Digital India in reality, because 'Villages or the Gram' form 75% of India, and unless they are connected 'In Real Time' the government is not connected to its people. The program, according to the Govt of India, has already got 4.5 Lakh Crore from companies, which is very promising.

Here are some key steps that government should take to ensure that Digital India is a success

1) Connectivity : Internet connectivity for all villages in India - Major Challenge
2) Training: Computer Education for office holders of villages & districts - Minor challenge
3) Infrastructure: Electricity or generators for each 'Gram or Village' to use computer.
4) Data Storage: Data is the most important aspect of e-governance. Governments will have to plan for PPP to build a Cloud Data Storage that can be used by Public & Government bodies - from Gram Panchayat and above.
5) Security: The biggest concern in e-enabling any data is the security risk.
6) Application Design: Mobile friendly design for all government portal applications - Minor Challenge
7) Scalable & High Availability Applications: Existing systems will not be able to handle the huge load of new entities and the applications have to be made scalable and I see this as a major architecture challenge.
8) Continuous Improvement: No system is perfect, and there is a need to set up 2 way communication between the Government & the Gram Panchayat to get feedback, improve the systems and make them more user friendly and robust.
9) Banking:  Banks have to make mobile banking more user friendly and they should reach villages. Public sector banks will have to really push their M-Banking for Digital India to be successful

These are the top 9 concerns that the government should address for successful Digital India initiative.

Tuesday, June 30

What the Indian government should consider when planning a Smart City

Smart Cities Mission, sometimes referred to as Smart City Mission, is an urban renewal and retrofitting program by the Government of India with the mission to develop 100 smart cities across the country making them citizen friendly and sustainable. The Union Ministry of Urban Development is responsible for implementing the mission in collaboration with the state governments of the respective cities.

A well defined vision, good planning and technology leaders/architects are needed to build this smart city ecosystem. They need to operate at the intersection of technology, innovation, business, operations, strategy and people. This is the “no man’s land” where traditional boundaries, processes, policies and rules fail. This is where the hardest problems are, and that is the key challenge in implementing a smart city.

In building the cities of tomorrow, these smart city ecosystem architects must focus on some key areas:

1. Break silos and build bridges. 

A sustainable and well functioning smart city is a seamless integration and smooth orchestration of people, processes, policies and technologies working together across the smart city ecosystem. The architects unify teams across municipal departments to achieve the goal. There is need to connect public and private organizations within the ecosystem & build consensus to co-create the new city.

2. Sound Vision and well defined goals. 

A smart city is not about technology, but about using technology together with the various ecosystem layers to create the ecosystem. These results should be aligned around the needs of the city – government efficiency, sustainability, health and wellness, mobility, economic development, public safety and quality of life.

3. Engage a broader community of innovators. 

Within the smart city, innovation and value creation comes not only from municipal agencies but from businesses, communities (business districts, “smart” buildings, housing complexes), and individual residents. Smart city ecosystem architects unify the various layers to enable, incentivize, facilitate and scale this larger community to co-create the smart city together.

4. Invest in policy making and partnerships at the beginning

Policies and partnerships are the catalysts of the smart city. Policies and partnerships leverage and amplify limited city resources and capabilities, help to scale faster, while minimizing risk. The smart city ecosystem architects address the needs of policymakers, technologists and innovators to create sensible policies that create the right outcomes. Policy makers need to create platform to proactively seek out public and private collaborators and build sustainable and synergistic partnerships.

5. Create unified data and not data islands

Data is the lifeblood of the smart city. Open data, generated by municipal organizations, is only one source of data. When supplemented with data created by businesses and private citizens, it yields richer insights and better outcomes. Smart city ecosystem architects utilize the full extent of the ecosystem to create “unified data”. They plan and build data marketplaces, robust data sharing and privacy policies, data analytics skills, and monetization models that facilitate the sourcing and usage of “city data”.

6. Manage connectivity as a strategic capability. 

While connectivity is mission critical, today’s smart city ecosystem architects are faced with several challenges – unequal access to basic connectivity, inadequacy of existing services & a confusing array of emerging wireless networks. In the smart city, connectivity is not an option, nor is it someone else’s problem to solve. Smart city architects must lead with new policies and public private partnerships. They must develop new innovative investment strategies & create new connectivity ecosystems with city owned, service provider owned, and community owned infrastructure.

7. Smart City needs modern IT infrastructure. 

Most smart city infrastructure is a confused integration of legacy systems, purpose built departmental technology and smart city point solutions. Cities must modernize their digital infrastructure, while expanding integration to the broader external ecosystem. Cyber-security and technology policies, processes and systems must be revised to be smart city centric, not IT centric. Digital skills, from data analytics and machine learning to software engineering, must be the new competencies of the smart city.

8. Design  Secure Systems

The smart city is only as smart as the trust its stakeholders have in it. Smart city architects must design for trust across the entire ecosystem. The technology infrastructure must be secure. Information collected must be protected, and used protecting owners’ privacy. Policies, legislation and technology must be continuously aligned to maintain the right balance of protection, privacy, transparency & utility. The infrastructure must be robust, resilient and reliable.


Monday, June 29

7) Discussing Digital - What is Digitization? What are the advantages of Digitization? What do we mean by Digital India?

What is 'Digital'? What is Digitization?

Digital information is stored as a sequence of binary digits, each either 0 or 1. The term Digitization refers to converting information of diverse nature (e.g. text, audio, video, image) into binary code. Digitization is the 1st step in making information available to share and collaborate on across the government, and in enabling software systems to implement Business Process Management (which means automating government business over the internet).
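
As a small illustration of what 'converting to binary code' means, the sketch below turns a piece of text into 0s and 1s and back again. It assumes the text is encoded as UTF-8, and the helper names `to_bits` and `from_bits` are made up for this example; audio, video and images are digitized with other encodings, but the end result is likewise a sequence of binary digits.

```python
def to_bits(text):
    """Encode text as UTF-8 and render each byte as 8 binary digits."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))

def from_bits(bits):
    """Reverse of to_bits: turn the binary digits back into text."""
    return bytes(int(chunk, 2) for chunk in bits.split()).decode("utf-8")

bits = to_bits("Gram")
print(bits)             # 01000111 01110010 01100001 01101101
print(from_bits(bits))  # Gram
```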

Digitization has many advantages
1) Information in digital format can be made accessible to users via internet in a controlled and secure manner. In analog format (paper/ photograph, video tape) it is not easy to share the information.
2) Information can be stored, maintained and preserved more easily than in analog format.
3) With the cost of hardware and internet bandwidth decreasing, the 'operations cost to store and maintain information' is also decreasing.
4) Once information is in digital format it can be shared and consumed across 'Business Process' from different Business Systems and partners and improve the speed of doing business.
5) Digitized data can be shared, searched, processed & analyzed using sophisticated software and this helps in providing Business Insights to the enterprise (public, private, government) enabling the enterprise do better business. 
One can imagine Digitization as a process of pulling information from different papers and putting it into a single 'word document'. Once the information is in this word document one can share it with another department in the company, search the document, extract information from it to complete some 'business process' and at the same time 'provide secured access' to the document to selected people based on their access rights.

What is Digital India program? 
Digital India program is Indian government's program to 'Digitize' all information created & consumed by the government agencies (such that it can be saved, shared, searched in a secured manner by the government & its partners) & integrating all public and government agencies from 'Villages, to Towns, to Cities to Indian Government systems'.  Digital India will enable government to  expand its existing e-governance program to entire population & provide a seamless e-governance. 

One can say that the digital revolution started way back when the 'internet of things' was made affordable and available to the common man. The E-Business model became successful in the 1990s, and shortly afterwards governments started implementing E-Governance by e-enabling Government to Government services, Government to Citizen services and Government to Business services. E-governance has taken off in a big way in private companies & government over the last 10 years, and most Indian states have implemented it to some degree.

Saturday, June 27

6) What real life business problems do Big Data solutions solve?

What types of business problems can a big data platform help you address? 

a) Each industry has multiple sources of Big Data (sensor device data, location specific data, GPS data, email, social networking data, transaction data, instrument logs etc.)

b) Each industry has unique 'information' that can be extracted from big data: by analyzing larger volumes of data than was previously possible to derive precise answers, by analyzing 'big data in motion as it is created', and by capitalizing on business opportunities that were previously lost.

c) A big data solution enables the organization to tackle complex problems that previously could not be solved because of the complexity due to sheer volume, required processing speed, the number of different data sources that needed to be processed and the time required to process the entire set of data

Here are a few examples of industries that are leveraging Big Data : 

1) Healthcare Industry : Healthcare industry is among the top 10 industries that is leveraging Big Data. A hospital can reduce patient mortality rate by using Big Data solution to analyze huge amount of patient health data & use it to aid diagnosis and better treatment for the patient. 

2) Telecommunication : By using a Big Data solution to analyze CDRs (call data records) together with switch and tower data, telecom companies can reduce the processing time by as much as 75%.

3) Electricity & Power Industry : By using Big Data solution power companies can analyze the logs and prevent power outages by performing preventive repairs. 

4) Airline Industry : An airplane has a complex software management system, and it generates a huge amount of data when it flies. By using a Big Data solution, airlines are analyzing the plane instrument logs to detect issues and perform preventive maintenance.

I have given a few examples from real life solutions implemented by various industries, and there are many more examples from other industries where big data has been processed and consumed to give the business a competitive edge.
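
The preventive maintenance examples above (power and airline industries) boil down to spotting readings in the logs that drift away from normal. Here is a minimal sketch of that idea; the instrument names, the numbers and the 3-standard-deviation rule are assumptions chosen for illustration, not what any particular airline or power company actually uses.

```python
from statistics import mean, stdev

def flag_for_maintenance(history, latest, threshold=3.0):
    """Return instruments whose latest reading is more than `threshold`
    standard deviations away from their historical mean."""
    flagged = []
    for instrument, readings in history.items():
        mu, sigma = mean(readings), stdev(readings)
        if sigma > 0 and abs(latest[instrument] - mu) / sigma > threshold:
            flagged.append(instrument)
    return flagged

# Hypothetical instrument logs from earlier flights.
history = {
    "engine_temp": [610, 612, 608, 611, 609],
    "oil_pressure": [45, 44, 46, 45, 45],
}
# Hypothetical readings from the latest flight.
latest = {"engine_temp": 640, "oil_pressure": 45}

print(flag_for_maintenance(history, latest))
```

A real solution would run a check like this over the logs of a whole fleet, which is where the volume and velocity of Big Data come in.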

Wednesday, June 24

5) Who is creating Big Data? How fast?

Just in case you have not seen this image of how much data is getting created every minute - this minute! (source: Pinterest)

 
Another favorite image  that gives an idea of Big Data generation is this one (source wikibon)
 

 
 

4) Why should today's businesses leverage their company's Big Data?

Integrating a Big Data solution (one that leverages the Big Data relevant to the company) with the traditional Business Intelligence solution is what will give complete value to any business. As competition increases and businesses look for 'intelligence' to improve their products, reduce inventory, increase sales, understand customer sentiments, and prevent losses and wastage, it is imperative that all 'relevant' data sources are analyzed and tapped for information, and that the data is leveraged to give the company's business an 'Edge'.

The following table gives a comparison of BI vs New Age BI leveraging Big Data

 

3) Big Data Characteristics - 5 Vs of Big Data


To understand Big Data let's discuss the characteristics of Big Data. Big Data has 5 dimensions (or characteristics) : Volume, Variety, Velocity, Veracity and Value. Let's briefly go through the popular definitions of the 5 V's

1) Volume: Volume refers to the vast amounts of data generated every second. If we take all the data generated in the world between the beginning of time and 2008, the same amount of data will soon be generated every minute. This makes most data sets too large to store and analyze using traditional database technology. New big data tools use distributed systems so that we can store and analyze data across databases that are dotted around anywhere in the world.
2) Variety: Variety refers to the variety of data generated today. Text, Audio, Video, Device Data, GPS data, Facebook data, Call Data Records, Air Flight Logs and 100s of other data types contribute to Big Data.
3) Velocity: Velocity refers to the high speed at which data is getting generated today. For example, stock exchanges generate high speed data, the GPS of a travelling car or plane generates data at high velocity, and each mobile tower generates CDR data at very high velocity. One of the Big Data challenges is how to process huge volumes of data generated at such high velocity.
4) Veracity:  Having a lot of data in different volumes coming in at high speed is worthless if that data is incorrect. Incorrect data can cause a lot of problems for organizations as well as for consumers. Therefore, organizations need to ensure that the data is correct as well as the analyses performed on the data are correct. Especially in automated decision-making, where no human is involved anymore, you need to be sure that both the data and the analyses are correct.
5) Value: All the data generated by different devices may or may not have any value for your business. While designing the Big Data solution you need to decide which data is relevant for the business and filter out the 'Noise Data' before you store & process the Big Data.

 The following image from IBM is my favorite Big Data infographic. Picture this image and you will never forget the key characteristics of Big Data. 

Some data scientists consider 'Visualization' to be the 6th V of Big Data, but I do not agree that Visualization is a characteristic of Big Data. So what is visualization? Is it related to Big Data? What is Big Data Analytics? Visualization is a discipline of business analytics, and it is about using tools to play with your data & analyze it to derive business value. Tableau and QlikView are some of the leading visualization tools. We will discuss Visualization in a future post.

Thursday, June 18

2) What is driving Big Data?

Current patterns of thought on storage, compute and analytics are being challenged.

1) Need to deal with Unstructured Data (Ex- Log processing, Firewall activity, Image/Video processing, Seismic processing)
2) Need to reduce Data Storage & Processing Costs (Ex- Moving ETL into a parallel environment, Pre-processing EDW, Integrating Enterprise DW with unstructured data sources)
3) Demand for Large Scale Data Analytics (Ex- Modeling the individual user, Large data sets without sampling, Cross enterprise data sets)
4) Requirement for Agile Business Intelligence (Ex- Flexible “schema on read” data space, The “magnetic” data warehouse)

Which means there is a need to integrate the New Gen Data with traditional Business Intelligence Data
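
The “schema on read” idea in point 4 can be illustrated with a short sketch: the raw events are stored as they arrive, and each consumer applies its own schema only when it reads them. The event fields, the sample data and the `read_with_schema` helper are all hypothetical and only meant to show the principle.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (no fixed schema).
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ms": 120}',
    '{"user": "u2", "page": "/cart"}',
    '{"user": "u1", "amount": "19.99", "page": "/checkout"}',
]

def read_with_schema(lines, schema):
    """Apply a schema at read time: pick the wanted fields, cast them,
    and default missing fields to None."""
    for line in lines:
        record = json.loads(line)
        yield {
            name: cast(record[name]) if name in record else None
            for name, cast in schema.items()
        }

# Two consumers read the same raw data with two different schemas.
page_views = list(read_with_schema(RAW_EVENTS, {"user": str, "page": str}))
purchases = [
    row
    for row in read_with_schema(RAW_EVENTS, {"user": str, "amount": float})
    if row["amount"] is not None
]
print(page_views)
print(purchases)
```

With a traditional “schema on write” warehouse, both views would have had to be designed before the first event was loaded.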


1) What is Big Data?

What is Big Data?
•Big Data comes into play as data sets become big enough to obscure underlying meaning and traditional methods of storing, accessing, and analyzing break down. Adding unstructured or semi-structured data to the mix creates additional layers of complexity.
Corporations are dealing with exploding quantities of data…
Lots of available data is unused; lots of unstructured data is being generated and…
The demand to combine this data and explore it to drive deeper insights, to predict rather than react is creating a demand for complex analytical capabilities with agility.

Image source - http://blog.sqlauthority.com

Sunday, February 1

GOTO 2014 • Microservices • Martin Fowler



https://www.youtube.com/watch?v=wgdBVIX9ifA&list=TLPQMjcxMjIwMTn4vRu8EOXJ-g&index=3



Friday, January 30

Understanding the SOA Reference Architecture

What is SOA?

As a quick refresher on the definition of SOA: a service-oriented architecture is essentially a collection of services that communicate with each other. The communication can involve simple data passing, or it can involve two or more services coordinating some activity. Service-oriented architectures are not a new thing. For many people, the first service-oriented architecture they encountered was built with DCOM or with Object Request Brokers (ORBs) based on the CORBA specification.
Wikipedia defines SOA (Service-oriented architecture) as a style of software design where services are provided to the other components by application components, through a communication protocol over a network. The basic principles of service-oriented architecture are independent of vendors, products and technologies. A service is a discrete unit of functionality that can be accessed remotely and acted upon and updated independently, such as retrieving a credit card statement online.

Key points to note are that a service has four properties according to one of many definitions of SOA:
  •     It logically represents a business activity with a specified outcome.
  •     It is self-contained.
  •     It is a black box for its consumers.
  •     It may consist of other underlying services
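
The four properties can be sketched in a few lines of Python, using the credit card statement example from the definition above. The class names, the `call` method and the sample data are assumptions made for illustration; in a real SOA the services would sit behind a network protocol rather than being plain Python objects.

```python
from typing import Protocol

class Service(Protocol):
    """All a consumer knows about a service: request in, response out (black box)."""
    def call(self, request: dict) -> dict: ...

class AccountLookup:
    """Self-contained service: resolves a customer to an account."""
    _accounts = {"c-1": "acct-100"}  # hypothetical data

    def call(self, request: dict) -> dict:
        return {"account": self._accounts[request["customer"]]}

class TransactionHistory:
    """Self-contained service: returns the transactions of an account."""
    _transactions = {"acct-100": [25.0, 40.5]}  # hypothetical data

    def call(self, request: dict) -> dict:
        return {"transactions": self._transactions[request["account"]]}

class CardStatement:
    """Business activity with a specified outcome, composed of underlying services."""
    def __init__(self, accounts: Service, history: Service):
        self._accounts = accounts
        self._history = history

    def call(self, request: dict) -> dict:
        account = self._accounts.call(request)["account"]
        tx = self._history.call({"account": account})["transactions"]
        return {"account": account, "transactions": tx, "total": sum(tx)}

statement = CardStatement(AccountLookup(), TransactionHistory()).call({"customer": "c-1"})
print(statement)
```

The consumer of `CardStatement` sees only the request and the response; it does not need to know that two other services were involved.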

Key Business Benefits of SOA


As a flexible and extensible architectural framework, SOA has the following defining capabilities:

  1. Reducing Cost: Provide the opportunity to consolidate redundant application functionality and decouple functionality from obsolete and increasingly costly applications while leveraging existing investments.
  2. Agility: Structure business solutions based on a set of business and IT services in such a way as to facilitate the rapid restructuring and reconfiguration of the business processes and solutions that consume them.
  3. Increasing Competitive Advantage: Provide the opportunity to enter into new markets and leverage existing business capabilities in new and innovative ways using a set of loosely-coupled IT services. Potentially increase market share and business value by offering new and better business services.
  4. Time-to-Market: Deliver business-aligned solutions faster by allowing the business to decide on the key drivers of a solution and allowing IT to rapidly support and implement that direction.
  5. Consolidation: Integrate across siloed solutions and organizations, reduce the physical number of systems, and enable consolidation of platforms under a program of “graceful transition” from legacy spaghetti dependencies to a more organized and integrated set of coexisting systems.
  6. Alignment: Enable organizations to better align IT to business goals, so that the business can associate IT with capabilities that the organization wants to achieve in alignment with its strategic plan, leading to both sustained agility and re-use over time.

However, significant challenges in creating an SOA solution still remain:

  •     Service identification,
  •     Service selection,
  •     Service design,
  •     Solution element selection and combination,
  •     Service modelling,
  •     Service governance,
  •     Interoperability and the ability to identify different components key to the effective design, usage, and evolution of SOA.

For example, from a technical perspective, the architect needs to answer questions such as:
  1.     What are the considerations and criteria for producing an SOA solution?
  2.     How can an SOA solution be organized as an architectural framework with inter-connected architectures and transformation capabilities?
  3.     How can an SOA solution be designed in a manner that maximizes asset re-use?
  4.     How can automated tools take the guesswork out of architecture validation and capacity planning?

In order to address these issues, there is a need for an SOA Reference Architecture for SOA-based solutions. The SOA reference architecture provides a high-level abstraction of an SOA partitioned and factored into layers, each of which provides a set of capabilities required to enable the working of an SOA. Each layer addresses a specific set of characteristics and responsibilities that relate to unique value propositions within an SOA. Underlying this layered architecture is a meta-model consisting of layers, capabilities, Architecture Building Blocks (ABBs), interactions, patterns, options and architectural decisions, together with the relations between capabilities, ABBs and layers. These will guide the architect in the creation and evaluation of the architecture. An ABB represents a basic element of re-usable functionality, providing support for one or more capabilities, that can be realized by one or more components or products; examples of the responsibilities of an ABB include service definition, mediation, routing, etc.
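
The relation between layers, ABBs and capabilities in this meta-model can be sketched with a couple of data classes. This is only a rough illustration and not part of any reference architecture specification: the layer names ("Services", "Integration") and ABB names are hypothetical, and only the three ABB responsibilities come from the text above.

```python
from dataclasses import dataclass, field

@dataclass
class ABB:
    """Architecture Building Block: re-usable functionality supporting capabilities."""
    name: str
    capabilities: list

@dataclass
class Layer:
    """A layer groups ABBs and offers the capabilities they support."""
    name: str
    abbs: list = field(default_factory=list)

    def capabilities(self):
        return sorted({cap for abb in self.abbs for cap in abb.capabilities})

def layers_providing(layers, capability):
    """Answer the architect's question: which layers provide this capability?"""
    return [layer.name for layer in layers if capability in layer.capabilities()]

# Hypothetical layers and ABBs.
layers = [
    Layer("Services", [ABB("Service Registry", ["service definition"])]),
    Layer("Integration", [ABB("Mediator", ["mediation"]), ABB("Router", ["routing"])]),
]
print(layers[1].capabilities())
print(layers_providing(layers, "routing"))
```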

MUSTREAD : How can you use Index Funds to help create wealth? HDFC MF Weekend Bytes

https://www.hdfcfund.com/knowledge-stack/mf-vault/weekend-bytes/how-can-you-use-index-funds-help-create-wealth?utm_source=Netcore...