Merriam Webster defines artificial intelligence in two ways:
1) a branch of computer science dealing with the simulation of intelligent behavior in computers, or
2) the capability of a machine to imitate intelligent human behavior.
The theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
The central problems (or goals) of AI research include reasoning, knowledge, planning, learning, natural language processing (communication), perception and the ability to move and manipulate objects. General intelligence is among the field's long-term goals. Approaches include statistical methods, computational intelligence, and traditional symbolic AI.
There is no agreed-upon definition for "Big Data." The tools of data science are as appropriate for gigabyte-scale as they are for petabyte-scale data sets. "Big data" typically refers to data on the scale of terabytes (10^12 bytes) and petabytes (10^15 bytes). A petabyte is a million gigabytes.
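The unit relationships above can be checked directly; this is a minimal sketch using decimal (SI) prefixes:

```python
# Scale of the units mentioned above, in bytes (decimal SI prefixes).
GIGABYTE = 10**9
TERABYTE = 10**12
PETABYTE = 10**15

# A petabyte is a million gigabytes:
print(PETABYTE // GIGABYTE)  # 1000000
```

Note that storage vendors typically use these decimal units, while operating systems sometimes report binary units (GiB = 2^30 bytes), which is a common source of confusion at this scale.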
A communications network. The word "cloud" often refers to the Internet, and more precisely to a data center full of servers that is connected to the Internet. However, the term "cloud computing" refers to the software and services that have enabled the Internet cloud to become so prominent in everyday life (see cloud computing).
Hardware and software services from a provider on the Internet (the "cloud"). Cloud providers replace in-house operations and are invaluable for companies, no matter their size or applications. Cloud servers can be configured to handle tiny or huge amounts of traffic and expand or contract as needed.
Cloud computing comprises software, infrastructure and platform services (SaaS, IaaS and PaaS), all of which are explained below.
Not necessarily. Although almost any computing performed in the cloud might be labeled cloud computing, it really came about in two ways. It arose with "software as a service" (SaaS) when applications were made available to companies over the Internet in the late 1990s. For the first time, internal IT departments were no longer responsible for application maintenance. Later on, companies such as Amazon and Google with massive data center and Internet expertise offered to lease their own infrastructure: "Infrastructure as a Service" (IaaS) and "Platform as a Service" (PaaS). See Amazon Web Services and Google App Engine.
(Self Service) The customer (end user or IT professional) signs up online, activates and uses the hardware and software from start to finish without phoning a provider to set up an account. Of course, tech support is always essential.
(Scalability) Servers can be quickly configured to process more data or to handle a larger, temporary workload such as Web traffic over the holidays.
(Speed) Major cloud providers are connected to the Internet via multiple Tier 1 backbones for fast response times and fault tolerance.
SaaS - Software as a Service
SaaS providers deliver the entire application to the end user, relieving the organization of all hardware and software maintenance. Myriad applications running from a Web browser use this model, including Web-based e-mail, G Suite and Salesforce.com's CRM. Customers pay by the number of users. For IT, this has been a paradigm shift, because security and privacy issues arise when company data are stored in the cloud.
Data analytics seeks to provide operational insight into issues that we either know we know or know we don't know (the known knowns and the known unknowns). Descriptive analytics, for example, quantitatively describes the main features of a collection of data.
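As a concrete illustration of descriptive analytics, the following sketch summarizes the main features of a small collection of data (the order counts are hypothetical, invented for the example):

```python
import statistics

# Hypothetical daily order counts for one week.
orders = [120, 135, 128, 150, 142, 138, 125]

# Descriptive analytics: quantitatively describe the collection's
# main features rather than predict or prescribe anything.
summary = {
    "count": len(orders),
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "stdev": round(statistics.stdev(orders), 2),
    "min": min(orders),
    "max": max(orders),
}
print(summary)
```

The same idea scales up: whether the tool is Python's `statistics` module or a distributed query engine, descriptive analytics answers "what does the data look like?", not "what should we do next?".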
The goal of Data Science, on the other hand, is to provide strategic actionable insights into the world where we don't know what we don't know. For example, it might try to identify a future technology that doesn't exist today but will have the greatest impact on an organization. Predictive analytics in the area of causation, prescriptive analytics (predictive plus decision science), and machine learning are three primary means through which actionable insights can be found.
Separating data analytics into operations and data science into strategy allows us to apply them more effectively to the enterprise solution value chain. Data analytics (EDA) leverages data assets to provide day-to-day operational insights, everything from counting assets to predicting inventory. Data science (EDS) then seeks to exploit the vastness of information and analytics in order to provide actionable decisions that have a meaningful impact on strategy.
While there are differences and commonalities between data analytics and data science, they are both equally important. Without analytics, we would not be able to operate our factories or even pay our employees. Data analytics powers the economic engine of society. On the other hand, without data science we would be stuck doing the same thing over and over, and our businesses would be incapable of real strategic growth. Data science is a catalyst that moves our society past stagnation.
Faced with massive volumes and heterogeneous types of data, organizations are finding that in order to deliver insights in a timely manner, they need a data storage and analytics solution that offers more agility and flexibility than traditional data management systems. A Data Lake is a new and increasingly popular way to store and analyze data that addresses many of these challenges. A Data Lake allows an organization to store all of its data, structured and unstructured, in one centralized repository. Since data can be stored as-is, there is no need to convert it to a predefined schema, and you no longer need to know what questions you want to ask of your data beforehand.
A Data Lake should support the following capabilities:
Furthermore, a Data Lake isn't meant to replace your existing Data Warehouses, but rather to complement them. If you're already using a Data Warehouse, or are looking to implement one, a Data Lake can be used as a source for both structured and unstructured data, which can be easily converted into a well-defined schema before ingesting it into your Data Warehouse. A Data Lake can also be used for ad hoc analytics with unstructured or unknown datasets, so you can quickly explore and discover new insights without the need to convert them into a well-defined schema.
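The "store as-is, shape later" idea is often called schema-on-read. Here is a minimal sketch of it, with hypothetical records invented for illustration: raw events land in the lake exactly as they arrived, and a schema is applied only when a question is asked:

```python
import json

# Hypothetical raw events, stored in the lake exactly as they arrived.
# No schema is imposed at write time, so ragged records are fine.
raw_records = [
    '{"user": "a1", "amount": "19.99", "ts": "2021-01-05"}',
    '{"user": "b2", "amount": "5.00"}',               # missing field: fine
    '{"user": "c3", "amount": "12.50", "extra": 1}',  # extra field: fine
]

def read_with_schema(lines):
    """Schema-on-read: shape the data only at query time."""
    for line in lines:
        rec = json.loads(line)
        yield {"user": rec["user"], "amount": float(rec["amount"])}

total = sum(r["amount"] for r in read_with_schema(raw_records))
print(round(total, 2))  # 37.49
```

A Data Warehouse, by contrast, is schema-on-write: the conversion step in `read_with_schema` would happen once, at ingestion, and every downstream query would see the cleaned form.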
This is also a fairly imprecise definition. Let's add a few specific properties of a data lake:
However, like many other data warehouses, yours may suffer from some of the issues I have described. If this is the case, you may choose to implement a data lake ALONGSIDE your warehouse. The warehouse can continue to operate as it always has while you start filling your lake with new data sources. You can also use the lake as an archive repository for warehouse data that you roll off, keeping it available so your users have access to more data than they have ever had before. As your warehouse ages, you may consider moving it to the data lake, or you may continue to offer a hybrid approach.
Data science involves using automated methods to analyze massive amounts of data and to extract knowledge from them. One way to consider data science is as an evolutionary step in interdisciplinary fields like business analysis that incorporates computer science, modeling, statistics, analytics, and mathematics to extract insights from data that help solve real world problems.
Data science is truly interdisciplinary. While mathematics and computer science are at its core, skills and advances in other disciplines are vital to its success. Improved hardware engineering, for example, can help gather and process data more efficiently. Graphic and instructional design can make findings easier to communicate and understand.
Many refer to this trending paradigm by the term Edge Computing as a way to emphasize that part of the work happens right at the edge of the network where IoT connects the physical world to the Cloud. But Edge Computing is much more than having computation and data processing on IoT devices. A fundamental part of it is the strong and seamless integration between IoT and Cloud; between the physical world and the world of computation.
An Edge Computing application uses the processing power of IoT devices to filter, pre-process, aggregate or score IoT data. It uses the power and flexibility of Cloud services to run complex analytics on those data and, in a feedback loop, support decisions and actions about and on the physical world.
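The device-side half of such an application can be sketched as follows; the readings, threshold values, and function name are assumptions invented for the example. Raw sensor data is filtered and aggregated on the device, and only the compact summary would be shipped to the Cloud for deeper analytics:

```python
# Hypothetical temperature readings from an IoT sensor; 98.0 is a glitch.
readings = [21.5, 21.7, 98.0, 21.6, 21.4, 21.8]

def edge_preprocess(values, lo=-40.0, hi=60.0):
    """Filter out-of-range readings, then aggregate the rest.

    Runs on the device, so only this small summary -- not the raw
    stream -- needs to cross the network to the Cloud.
    """
    valid = [v for v in values if lo <= v <= hi]
    return {
        "count": len(valid),
        "mean": sum(valid) / len(valid),
        "max": max(valid),
    }

summary = edge_preprocess(readings)
print(summary)
```

In the feedback loop the article describes, the Cloud would run heavier analytics over many such summaries and push decisions (for example, new filter thresholds) back down to the devices.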
We have pinpointed three main motivating factors for using Edge Computing:
1. Preserve privacy
Data captured by IoT devices can contain sensitive or private information, e.g., GPS data, streams from cameras, or microphones. While an application might want to use this information to run complex analytics in the Cloud, it is important that, whenever data leaves the premises where it is generated, the privacy of sensitive content is preserved. With Edge Computing, an application can make sure that sensitive data is pre-processed on-site, and only data that is privacy compliant is sent to the Cloud for further analysis, after having passed through a first layer of anonymizing aggregation.
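A first layer of anonymizing aggregation might look like the following sketch (the coordinates and grid precision are assumptions for illustration): raw GPS fixes never leave the device; only coarse per-cell counts are sent to the Cloud:

```python
# Hypothetical raw GPS fixes captured on the device (lat, lon).
raw_gps_fixes = [
    (47.6097, -122.3331),
    (47.6105, -122.3340),
    (40.7128, -74.0060),
]

def anonymize(fixes, precision=1):
    """Anonymizing aggregation: snap coordinates to a coarse grid and
    report only counts per cell, never individual positions."""
    cells = {}
    for lat, lon in fixes:
        cell = (round(lat, precision), round(lon, precision))
        cells[cell] = cells.get(cell, 0) + 1
    return cells

# Only this aggregate -- not raw_gps_fixes -- leaves the premises.
payload_for_cloud = anonymize(raw_gps_fixes)
print(payload_for_cloud)
```

Real deployments would layer stronger guarantees on top (for example, suppressing cells with very few observations, or adding noise), but the principle is the same: the privacy-sensitive transformation happens at the edge, before any data reaches the Cloud.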
2. Reduce latency
The power and flexibility of Cloud computing has enabled many scenarios that were impossible before. Think about how the accuracy of image or voice recognition algorithms has improved in recent years. However, this accuracy has a price: the time needed to get an image or a piece of audio recognized is significantly affected by the non-negligible yet unavoidable network delays due to data being shipped to the Cloud and results computed and sent back to the edge. When low-latency results are needed, Edge Computing applications can implement machine-learning algorithms that run directly on IoT devices, and only interact with the Cloud off the critical path, for example, to continuously train machine learning models using captured data.
3. Be robust against connectivity issues
Designing applications to run part of the computation directly on the Edge not only reduces latency, but also helps ensure that applications are not disrupted in case of limited or intermittent network connectivity. This can be very useful when applications are deployed in remote locations where network coverage is poor, or to reduce the costs of expensive connectivity technologies like cellular.
An end-to-end solution (E2ES) means that a single provider supplies all of the customer's software and hardware requirements for an application, so that no other vendor needs to be involved. E2ES includes installation, integration, and setup.
"A world where physical objects are seamlessly integrated into the information network, and where the physical objects can become active participants in business processes. Services are available to interact with these 'smart objects' over the Internet, query their state and any information associated with them, taking into account security and privacy issues"
"At the core of this evolution of the Internet is the idea that the Internet becomes more sensory — more proactive and less reactive. It also takes into account that the world has hit a point where there are more devices connecting to the Internet than people doing so."
The Internet of Things refers to "the network of physical objects (devices, vehicles, equipment, homes, buildings) that are connected to the internet through embedded devices and software, which allows these physical objects to collect, analyze and exchange data."
The Internet of Things is the network of physical objects that contain embedded technology to communicate and sense or interact with their internal states or the external environment.
- Tata Consultancy
Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome.
Arthur Samuel defined machine learning as the field of study that gives computers the ability to learn without being explicitly programmed.
The emphasis of machine learning is on automatic methods. In other words, the goal is to devise learning algorithms that do the learning automatically without human intervention or assistance. The machine learning paradigm can be viewed as "programming by example." Often we have a specific task in mind, such as spam filtering. But rather than program the computer to solve the task directly, in machine learning we seek methods by which the computer will come up with its own program based on examples that we provide. Machine learning is a core subarea of artificial intelligence.
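The spam-filtering example above can be sketched as "programming by example" in a few lines. This is a toy word-count classifier with invented training messages, not any particular production algorithm; the point is that no spam rule is written by hand, the rules are derived from labeled examples:

```python
from collections import Counter

# Labeled examples we provide; the computer derives its own "program".
training = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow at noon", "ham"),
]

# "Learning": count how often each word appears in each class.
counts = {"spam": Counter(), "ham": Counter()}
for text, label in training:
    counts[label].update(text.split())

def classify(text):
    """Score a new message by how familiar its words are to each class."""
    scores = {
        label: sum(c[word] for word in text.split())
        for label, c in counts.items()
    }
    return max(scores, key=scores.get)

print(classify("free money prize"))    # spam
print(classify("noon meeting lunch"))  # ham
```

Real spam filters use the same pattern with better statistics (for example, naive Bayes over word probabilities), but the workflow is identical: provide examples, learn parameters, then apply the learned function to unseen inputs.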