Tech Guides

GDPR is pushing the Chief Data Officer role center stage

Aaron Lazar
13 Apr 2018
9 min read
Gartner predicted that by 2020, 90% of large organizations in regulated industries will have a Chief Data Officer role. With the recent heat around Facebook, Mark Zuckerberg, and the fast-approaching GDPR compliance deadline, it's quite likely that 2018 will be the year of the Chief Data Officer. This article was first published in October 2017 and has been updated to keep up with the latest trends in GDPR.

In 2014, around 400 CEOs and top business execs were asked how they recognize data as a corporate asset. They responded with a mixed set of reactions and viewed the worth of data in their organization in varied ways. Now, in 2018, these reactions have drastically changed: more and more organizations have realized the importance of Data-as-an-Asset. More importantly, the European Union (EU) has made it mandatory that General Data Protection Regulation (GDPR) compliance be achieved by the 25th of May, 2018.

The primary reason for creating the Chief Data Officer role was to bridge the gap between functional management and IT teams in an organization. But now, it looks like the CDO will also be primarily focusing on setting up and driving GDPR compliance, to avoid a fine of up to €20 million or 4% of the annual global turnover of the previous year, whichever is higher. We're going to spend a few minutes breaking down the Chief Data Officer role for you, revealing several interesting insights along the way. Let's start with the obvious question.

What might the Chief Data Officer's responsibilities be?

Like other C-suite execs, a Chief Data Officer is expected to have a well-blended mix of technical know-how and business acumen. Their role is very diverse, and its scope can be hard to define. Here are some of the key responsibilities of a Chief Data Officer:

Data policies and GDPR compliance

Data security is one of the most important elements that any business must consider. It needs to comply with the regulatory standards and requirements of the country where it operates. A Chief Data Officer is responsible for ensuring compliance with policies across all branches of the business and the associated compliance requirement taxonomies, on a global level.

What is GDPR?

If you're thinking there were no data protection laws before the General Data Protection Regulation 2016/679, that's not true. The major difference, however, is that GDPR focuses more on customer data privacy and protection. GDPR requirements will change the way organizations store, process and protect their customers' personal data. They will need solutions for assessing, implementing and maintaining GDPR compliance, and that's where Chief Data Officers fit in.

Using data to gain a competitive edge

Chief Data Officers need sound knowledge of the business's customers and the markets where it operates, along with strong analytical skills, to leverage the right data at the right time and in the right place. This would eventually give the business an edge over its competitors in the market. For example, a ferry service could use data to identify the rates that customers would be willing to pay at a certain time of the day.

Setting in motion the best practices of data governance

Organizations span the globe these days, and employees from different parts of the world often work on the same data. This can result in data moving through unconnected systems and ending up as inefficient or disjointed pieces of business information. A Chief Data Officer needs to ensure that this information is aggregated and maintained in such a way that clear information ownership across the organization is established.

Architecting future-proof data solutions

A Chief Data Officer often acts like a Data Architect. They will sometimes take on responsibility for planning, designing, and building Big Data systems and ensuring successful integration with other systems in an organization. Designing systems that can provide answers to the user's problems now and in the future is vital. Chief Data Officers are often found asking themselves key questions, like how to generate data with maximum reusability while also making sure it's as accurate and relevant as possible.

Defining Information Management tools

Different business units across the globe tend to use different tools, technologies, and systems to work on, store, and share information on an enterprise level. This greatly affects a company's ability to access and leverage data for effective decision making and various other duties. A Chief Data Officer is responsible for establishing data-oriented standards across the business and for driving all arms of the business to comply with the standards and embrace change, to ensure the integrity of data.

Spotting new opportunities

A Chief Data Officer is responsible for spotting new opportunities that the business can venture into, through careful analysis of data and past records. For example, a motor company could leverage certain sales information to make an informed decision on which age group to target with its new SUV in the making, to maximize sales.

These are just the tip of the iceberg when it comes to a Chief Data Officer's responsibilities. Responsibilities go hand-in-hand with the skills and traits required to execute them. Below are some key capabilities sought after in a CDO.

Key Chief Data Officer skills

The right person for the job is expected to possess impeccable leadership and C-suite level communication skills, as well as strong business acumen. They are expected to have strong knowledge of GDPR software tools and solutions to enable the organizations hiring them to swiftly transform and adopt the new regulations. They are also expected to possess knowledge of IT architecture, including a familiarity with leading architectural standards such as TOGAF or the Zachman Framework. They need to be experienced in driving data governance as well as data quality and integrity, while also possessing a strong knowledge of data analytics, visualization, and storytelling. Familiarity with Big Data solutions like Hadoop, MapReduce, and HBase is a plus.

How much do Chief Data Officers earn?

Now, let's take a look at what kind of compensation a Chief Data Officer is likely to be offered. To tell you the truth, the answer to this question is still a bit hazy, but it's sure to pick up speed with the recent developments in the regulatory and legal areas related to data. About a year ago, a blog post from CareerAddict revealed that the salary for a CDO in the US was around $112,000 annually. A job listing seen on Indeed quoted $200,000 as the annual salary. Indeed shows 7 jobs posted for a CDO in the last 15 days. We took the estimated salaries of CDOs and compared them with those of CIOs and CTOs in the same company. It turns out that most were on par, with a few CDO compensations falling slightly short of the CIO and CTO salary. These are just basic salary figures; bonuses come on top and can amount to up to 50% in some cases. However, please note that these salary figures vary heavily based on the type of organization and the industry.

Do businesses even need a Chief Data Officer?

One might argue that some of the skills expected of a Chief Data Officer would also be held by the CIO, the Chief Digital Officer, or the Data Protection Officer (if the organization has one). Then why have a Chief Data Officer at all and incur a significant extra cost to the company? With the rapid change in tech and the rate at which data is generated, used, and discarded, most indicators point in the direction of having a separate Chief Data Officer working alongside the CIO.

It's critical to have a clearly defined need for both roles to co-exist. Blurring the boundaries of the two roles can be detrimental, and organizations must therefore be painstakingly mindful of the defined KRAs. The organisation should clearly define the two roles to keep the business structure running smoothly. A Chief Data Officer's main focus will be on the latest data-centric technological innovations and their compliance with the new standards, while also boosting customer engagement, privacy and, in turn, loyalty and the business's competitive advantage. The CIO, on the other hand, focuses on improving the bottom line by owning business productivity metrics, cost-cutting initiatives, making IT investments and so on - i.e., an inward-facing data-management and architecture role. The CIO is therefore the person responsible for leading digital initiatives at a board level.

In addition to managing data and governing information, if the CIO's responsibilities were to also include implementing analytics in fresh ways to generate value for the business, it would be a tall order. To put it simply, it is more practical for the CIO to own the systems and the CDO to oversee all the bits and bytes that flow through those systems. Moreover, in several cases, the Chief Data Officer will act as a liaison between the business and IT. Thus, Chief Data Officers and CIOs need to work together and support each other for a better functioning business.

The bottom line: a Chief Data Officer is essential

For an organization dealing with a lot of data, a Chief Data Officer is a must. Failure to have one on board can result in being fined €10 million or 2% of the organization's worldwide turnover (whichever is higher). Here are the criteria under which an organization needs dedicated personnel managing data protection. The organization's core activities should:

Have data processing operations which require regular and systematic monitoring of data subjects on a large scale, or monitoring of individuals
Involve large-scale processing of special categories of data (i.e. sensitive data such as health, religion, race, sexual orientation, etc.)
Have data processing carried out by a public authority or a body processing personal data, except for courts operating in their judicial capacity

Apart from this mandate, a Chief Data Officer can add immense value by aligning data-driven insights with the organization's vision and goals. A CDO can bridge the gap between the CMO and the CIO by focusing on meeting customer requirements through data-driven products. For those in data- and insights-centric roles such as data scientists, data engineers, data analysts and others, the CDO is a natural destination in their career progression journey. The Chief Data Officer role is highly attractive in terms of the scope of responsibilities, the capabilities and, of course, the pay. Certification courses like this one are popping up to help individuals shape themselves for the role. All in all, this new C-suite position is the perfect pivot between old and new, bridging silos and building a future where data privacy stays intact.

Amazon Sagemaker makes machine learning on the cloud easy

Amey Varangaonkar
12 Apr 2018
5 min read
Amazon Sagemaker was launched by Amazon back in November 2017. It was built with the promise of simplifying machine learning on the cloud. The software was a response not only to the increasing importance of machine learning, but also to the demand to perform machine learning in the cloud. Amazon Sagemaker is clearly a smart move by Amazon that will consolidate the dominance of AWS in the cloud market.

What is Amazon Sagemaker?

Amazon Sagemaker is Amazon's premium cloud-based service which serves as a platform for machine learning developers and data scientists to build, train and deploy machine learning models on the cloud. One of the features that makes Sagemaker stand out from the rest is that it is business-ready. This means machine learning models can be optimized for high performance and deployed at scale to work on data of varying sizes and complexity. The basic intention of Sagemaker, as Amazon CTO Werner Vogels mentioned in his keynote, is to remove any barriers that slow down the machine learning process for developers. In a standard machine learning process, a developer spends most of the time doing the following standard tasks:

Collecting, cleaning and preparing the training data set
Selecting the most appropriate algorithm for the machine learning problem
Training the model for accurate prediction
Optimizing the model's performance
Integrating the model with the application
Deploying the application to production

Most of these tasks require a lot of expertise and, more importantly, time and effort. Not to mention the computational resources such as storage space and processing memory. The larger the dataset, the bigger this problem becomes. Amazon Sagemaker removes these complexities by providing a solid platform with built-in modules that can be used together or individually to complete each of the above tasks with relative ease.

How Amazon Sagemaker works

Amazon Sagemaker offers a lot of options for machine learning developers to train and optimize their machine learning models to work at scale. For starters, Sagemaker comes integrated with hosted Jupyter notebooks to allow developers to visually explore and analyze their dataset. You can also move your data directly from popular Amazon databases such as RDS, DynamoDB and Redshift into S3 and conduct your analysis there. The simple block diagram below demonstrates the core working of Amazon Sagemaker:

Amazon Sagemaker includes 12 high-performance, production-ready algorithms which can be used to build and deploy models at scale. Some of the popular ones include k-means clustering, Principal Component Analysis (PCA), neural topic modeling, and more. It comes pre-configured with popular machine learning and deep learning frameworks such as Tensorflow, PyTorch, Apache MXNet and more, but you can also use your own framework without any hassle. Once your model is trained, Sagemaker makes use of AWS's auto-scaled clusters to deploy the model, making sure the model doesn't lack in performance and is highly available at all times. Not just that, Sagemaker also includes built-in testing capabilities for you to test and check your model for any issues before it is deployed to production. (A short code sketch of this train-and-deploy flow appears at the end of this article.)

Benefits of using Amazon Sagemaker

Businesses are likely to adopt Amazon Sagemaker mainly because it makes the whole machine learning process so effortless. With Sagemaker, it becomes very easy to build and deploy smarter applications that give accurate predictions, and thereby help increase business profitability.

Significantly reduces time: With built-in modules, Sagemaker significantly reduces the time required to do a variety of machine learning tasks, and the models can be deployed to production in very little time. This is important for businesses, as near-real-time insights obtained from smart applications help them optimize their processes quickly and effectively get an edge over their competition.

Effortless and more productive machine learning: By virtue of the one-click training and deployment feature offered by Sagemaker, machine learning engineers and developers can now focus on asking the right questions of the data and on the results rather than the process. They can also devote more time to optimizing the model rather than to collecting and cleaning the data, which takes up most of their time.

Flexibility in using the algorithms and frameworks: With Sagemaker, developers have the freedom to choose the best-possible algorithm and tool for performing machine learning effectively.

Easy integration, access and optimization: The models trained using Sagemaker can be integrated into an existing business application seamlessly, and are optimized for speed and high performance. Backed by the computational power of AWS, businesses can rest assured their applications will continue to perform optimally without any risk of failure.

Sagemaker - Amazon's answer to Cloud AutoML

In a three-way cloud war between Google, Microsoft and Amazon, it is clear Google and Amazon are trying to go head to head to establish their supremacy in the market, especially in the AI space. Sagemaker is Amazon's answer to Google's Cloud AutoML, which was made publicly available in January and delivers a similar promise - making machine learning easier than ever for developers. With Amazon serving a large customer base, a platform like Sagemaker helps them create a system that runs at scale and handles vast amounts of data quite effortlessly. Amazon is yet to release any technical paper on how Sagemaker's streaming algorithms work, but that will certainly be something to look out for in the near future. Considering Amazon identifies AI as key to their future product development, to think of Sagemaker as a better, more complete cloud service which also has deep learning capabilities is definitely not far-fetched.
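The train-and-deploy flow described above maps closely onto the SageMaker Python SDK. Below is a minimal, illustrative sketch using the built-in k-means algorithm; the IAM role ARN, S3 bucket and instance types are placeholders (not from the article), and exact parameter names vary between SDK versions.

```python
import numpy as np
from sagemaker import KMeans

# Placeholder IAM role ARN and S3 output bucket; replace with real values
role = "arn:aws:iam::123456789012:role/MySageMakerRole"
output_path = "s3://my-example-bucket/kmeans-output/"

# Toy training data: 1,000 points with 50 features each
train_data = np.random.rand(1000, 50).astype("float32")

# One of SageMaker's built-in, production-ready algorithms
kmeans = KMeans(
    role=role,
    instance_count=1,            # older SDK versions name this train_instance_count
    instance_type="ml.c4.xlarge",
    output_path=output_path,
    k=10,
)

# Training runs on managed infrastructure; record_set() uploads the data to S3
kmeans.fit(kmeans.record_set(train_data))

# One call deploys the trained model behind a hosted, auto-scaled endpoint
predictor = kmeans.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")
print(predictor.predict(train_data[:5]))
```

Running this requires AWS credentials with SageMaker permissions; the point is simply how little orchestration code sits between raw data and a hosted endpoint.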

BrickCoin might just change your mind about cryptocurrencies

Savia Lobo
11 Apr 2018
3 min read
At the start of 2018, the cryptocurrency boom seemed to be at an end. Bitcoin's price plunged in just a matter of weeks from a mid-December 2017 high of $20,000 to less than $10,000. Suddenly everything seemed unpredictable and volatile. The cryptocurrency honeymoon was at an end. However, while many are starting to feel cautious about investing, a new cryptocurrency might change the game. BrickCoin might well be the cryptocurrency to reinvigorate a world that's shifted from optimism to pessimism in just a couple of months. But what is BrickCoin? And how is it different from other cryptocurrencies? Most importantly, why might you be confident in its success?

What is BrickCoin?

BrickCoin is also a blockchain-based currency, but one backed by real estate (REITs). BrickCoin aims to be the first regulated, KYC- and AML-compliant real estate crypto. Real estate is comprehensible to regulators and is accepted by many as a robust asset class. This is a major distinguishing point between BrickCoin and other cryptocurrencies. Traditional money-saving methods - savings accounts and fixed deposits - are not inflation-proof. They also offer very low levels of interest at the moment. On the other hand, complex investment options such as hedge funds are typically only available to wealthy individuals, as they require large initial investments. These also do not offer ready liquidity and are vulnerable to bankruptcy. BrickCoin comes to the rescue here, as it claims to be inflation-proof. Find out more about BrickCoin here.

The key features of BrickCoin

It is a savings token which can be bought with traditional currency or digital currency.
It represents an investment in a piece of commercial, debt-free real estate.
The real estate is held as part of a very secure, high-value, debt-free REIT.
BrickCoins are kept in a mobile digital wallet.
All transactions are fully managed, validated and trackable by blockchain technology.
BrickCoins can be converted into fiat currency instantly.

Also read about Crypto-ML, a machine learning powered cryptocurrency platform.

BrickCoin is essentially the next step in the evolution of cryptocurrency. It is a savings scheme that is backed by a non-inflationary asset - commercial, debt-free real estate - to deliver stable capital preservation. As a cryptocurrency, it allows savers to convert their money to and from BrickCoin tokens using the full security and convenience of blockchain technology. BrickCoin will be the first cryptocurrency that bridges the gap between the necessary reliance on fiat currencies and the asset-backed wealth-creation opportunities that are often out of reach for many ordinary savers.

Crypto News
Cryptojacking is a growing cybersecurity threat, report warns
Coinbase Commerce API launches
Crypto-ML, a machine learning powered cryptocurrency platform

Crypto Op-Ed
There and back again: Decrypting Bitcoin's 2017 journey from $1000 to $20000
Will Ethereum eclipse Bitcoin?
Beyond the Bitcoin: How cryptocurrency can make a difference in hurricane disaster relief

Cryptocurrency Tutorials
Predicting Bitcoin price from historical and live data
How to mine bitcoin with your Raspberry Pi
Protecting Your Bitcoins

Everything you need to know about Ethereum

Packt Editorial Staff
10 Apr 2018
8 min read
Ethereum was first conceived of by Vitalik Buterin in November 2013. The critical idea proposed was the development of a Turing-complete language that allows the development of arbitrary programs (smart contracts) for blockchain and decentralized applications. This is in contrast to Bitcoin, where the scripting language is limited in nature and allows only the necessary operations. This is an excerpt from the second edition of Mastering Blockchain by Imran Bashir.

The following table shows all the releases of Ethereum, from the first release to the planned final release:

Version: Release date
Olympic: May 2015
Frontier: July 30, 2015
Homestead: March 14, 2016
Byzantium (first phase of Metropolis): October 16, 2017
Metropolis: To be released
Serenity (final version of Ethereum): To be released

The first version of Ethereum, called Olympic, was released in May 2015. Two months later, a second version was released, called Frontier. After about a year, another version named Homestead, with various improvements, was released in March 2016. The latest Ethereum release is called Byzantium. This is the first part of the development phase called Metropolis. This release implemented a planned hard fork at block number 4,370,000 on October 16, 2017. The second part of this release, called Constantinople, is expected in 2018, but there is no exact time frame available yet. The final planned release of Ethereum is called Serenity. Serenity is planned to introduce the final version of a PoS-based blockchain in place of PoW.

The yellow paper

The Yellow Paper, written by Dr. Gavin Wood, serves as a formal definition of the Ethereum protocol. Anyone can implement an Ethereum client by following the protocol specifications defined in the paper. While this paper is a challenging read, especially for those who do not have a background in algebra or mathematics, it contains a complete formal specification of Ethereum. This specification can be used to implement a fully compliant Ethereum client. The list of symbols used in the paper, with their meanings, is provided here in the anticipation that it will make reading the Yellow Paper more accessible. Once the symbol meanings are known, it is much easier to understand how Ethereum works in practice.

≡  Is defined as
=  Is equal to
≠  Is not equal to
║...║  Length of
∈  Is an element of
∉  Is not an element of
∀  For all
∃  There exists
∪  Union
∧  Logical AND
∨  Logical OR
⊕  Exclusive OR
:  Such that
.  Sequence concatenation
+  Addition
-  Subtraction
∑  Summation
>  Is greater than
≤  Less than or equal to
(a, b)  Real numbers ≥ a and < b
{}  Set
()  Function of tuple
[]  Array indexing
⌊ ⌋  Floor, lowest element
⌈ ⌉  Ceiling, highest element
∅  Empty set, null
{  Describing various cases of if, otherwise
σ  Sigma, world state
μ  Mu, machine state
Υ  Upsilon, Ethereum state transition function
Π  Block-level state transition function
Λ  Contract creation function

Ethereum blockchain

Ethereum, like any other blockchain, can be visualized as a transaction-based state machine. This definition is referred to in the Yellow Paper. The core idea is that in the Ethereum blockchain, a genesis state is transformed into a final state by executing transactions incrementally. The final transformation is then accepted as the absolute undisputed version of the state.
In the following diagram, the Ethereum state transition function is shown, where a transaction execution has resulted in a state transition. In the example, a transfer of two Ether from address 4718bf7a to address 741f7a2 is initiated. The initial state represents the state before the transaction execution, and the final state is what the morphed state looks like. Mining plays a central role in state transition, and we will elaborate on the mining process in detail in later sections. The state is stored on the Ethereum network as the world state. This is the global state of the Ethereum blockchain.

How Ethereum works from a user's perspective

For all the conversation around cryptocurrencies, it's very rare for anyone to actually explain how they work from the perspective of a user. Let's take a look at how it works in practice, using the example of one person (Bashir) transferring money to another (Irshad). You may also want to read our post on whether Ethereum will eclipse Bitcoin. For the purposes of this example, we're using the Jaxx wallet; however, you can use any cryptocurrency wallet for this.

First, either a user requests money by sending the request to the sender, or the sender decides to send money to the receiver. The request can be sent by sharing the receiver's Ethereum address with the sender. For example, if Irshad requests money from Bashir, she can send the request to Bashir as a QR code encoding her Ethereum address, shared via email, text or any other communication method, as shown in the following screenshot.

Once Bashir receives this request, he will either scan the QR code or copy the Ethereum address into his Ethereum wallet software and initiate a transaction. This process is shown in the following screenshot, where the Jaxx Ethereum wallet software on iOS is used to send money to Irshad. The screenshot shows that the sender has entered both the amount and the destination address for sending Ether. Just before sending the Ether, the final step is to confirm the transaction, which is also shown here:

Once the request (transaction) for sending money is constructed in the wallet software, it is broadcast to the Ethereum network. The transaction is digitally signed by the sender as proof that he is the owner of the Ether. This transaction is then picked up by nodes called miners on the Ethereum network for verification and inclusion in a block. At this stage, the transaction is still unconfirmed. Once it is verified and included in a block, the PoW process begins. Once a miner finds the answer to the PoW problem, by repeatedly hashing the block with a new nonce, the block is immediately broadcast to the rest of the nodes, which then verify the block and the PoW. If all the checks pass, the block is added to the blockchain, and miners are paid rewards accordingly.

Finally, Irshad gets the Ether, and it is shown in her wallet software, as seen here: On the blockchain, this transaction is identified by the following transaction hash: 0xc63dce6747e1640abd63ee63027c3352aed8cdb92b6a02ae25225666e171009e Details regarding this transaction can be visualized in a block explorer, as shown in the following screenshot: This walkthrough should give you some idea of how it works.
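For readers who would rather see the same transfer programmatically than through a wallet app, here is a minimal sketch using the web3.py library. This is not part of the book excerpt; the node URL, private key and addresses are placeholders, and attribute names (for example, rawTransaction) differ slightly between web3.py versions.

```python
from web3 import Web3

# Placeholder node endpoint, key and addresses; replace with real values
w3 = Web3(Web3.HTTPProvider("https://mainnet.example-node.invalid"))

sender_key = "0x" + "11" * 32
sender = w3.eth.account.from_key(sender_key).address
receiver = "0x0000000000000000000000000000000000000001"

# Build the transaction: transfer two Ether, as in the walkthrough above
tx = {
    "to": receiver,
    "value": w3.to_wei(2, "ether"),
    "gas": 21000,
    "gasPrice": w3.to_wei(50, "gwei"),
    "nonce": w3.eth.get_transaction_count(sender),
    "chainId": 1,  # Ethereum mainnet
}

# Signing proves the sender owns the Ether; broadcasting hands it to the miners
signed = w3.eth.account.sign_transaction(tx, sender_key)
tx_hash = w3.eth.send_raw_transaction(signed.rawTransaction)

# Blocks until a miner includes the transaction in a block
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
print("Included in block", receipt.blockNumber)
```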
Different Ethereum networks

The Ethereum network is a peer-to-peer network where nodes participate in order to maintain the blockchain and contribute to the consensus mechanism. Networks can be divided into three types, based on requirements and usage. These types are described in the following subsections.

Mainnet

Mainnet is the current live network of Ethereum. The current version of mainnet is Byzantium (Metropolis) and its chain ID is 1. The chain ID is used to identify the network. A block explorer, which shows detailed information about blocks and other relevant metrics, is available here. This can be used to explore the Ethereum blockchain.

Testnet

Testnet is the widely used test network for the Ethereum blockchain. This test blockchain is used to test smart contracts and DApps before they are deployed to the production live blockchain. Because it is a test network, it allows experimentation and research. The main testnet is called Ropsten, which contains all the features of the other smaller and special-purpose testnets that were created for specific releases. For example, other testnets include Kovan and Rinkeby, which were developed for testing Byzantium releases. The changes that were implemented on these smaller testnets have also been implemented on Ropsten. Now the Ropsten test network contains all the properties of Kovan and Rinkeby.

Private net

As the name suggests, this is a private network that can be created by generating a new genesis block. This is usually the case in private blockchain distributed ledger networks, where a private group of entities start their own blockchain and use it as a permissioned blockchain. The following table shows the list of Ethereum networks with their network IDs. These network IDs are used by Ethereum clients to identify the network.

Network name: Network ID / Chain ID
Ethereum mainnet: 1
Morden: 2
Ropsten: 3
Rinkeby: 4
Kovan: 42
Ethereum Classic mainnet: 61

You should now have a good foundation of knowledge to get started with Ethereum. To learn more about Ethereum and other cryptocurrencies, check out the new edition of Mastering Blockchain.

Other posts from this book
A brief history of Blockchain
Write your first Blockchain: Learning Solidity Programming in 15 minutes
15 ways to make Blockchains scalable, secure and safe!
What is Bitcoin

Unity plugins for augmented reality application development

Sugandha Lahoti
10 Apr 2018
4 min read
Augmented Reality is the powerhouse for the next set of magic tricks headed to our mobile devices. Augmented Reality combines real-world objects with digital information. Heard about Pokemon Go? An ARKit-powered version of it was showcased by Niantic at WWDC 2017, built on Apple's augmented reality framework, ARKit. Following the widespread success of Pokemon Go, a large number of companies are eager to invest in AR technology.

Unity is one of the dominant players in the industry when it comes to creating desktop, console and mobile games. Augmented Reality has been exciting game developers for quite some time now, and following this excitement Unity has released prominent tools for developers to experiment with AR apps. Bear in mind that Unity is not designed exclusively for Augmented Reality, so developers can access additional functionality by importing extensions. These extensions also provide pre-designed game components such as characters or game props. Let us briefly look at three prominent tools or extensions for Augmented Reality development provided by Unity.

Unity ARKit plugin

The Unity ARKit plugin exposes the functionality of the ARKit SDK within Unity projects. As of September 2017, this plugin is also extended for iOS apps as the iOS ARKit plugin. The ARKit plugin provides Unity developers with access to features such as motion tracking, vertical and horizontal plane finding, live video rendering, hit-testing, raw point cloud data, ambient light estimation, and more for their AR projects. This plugin also provides easy integration of AR features into existing Unity projects. A new tool, the Unity ARKit Remote, speeds up iteration by allowing developers to make real-time changes to the scene and debug scripts in the Unity Editor. The latest update to iOS ARKit is version 1.5, which provides developers with more tools to power more immersive AR experiences.

Google ARCore

Google ARCore for Unity provides mobile AR experiences for Android, without the need for additional hardware. The latest major version, ARCore 1.0, enables AR applications to track a phone's motion in the real world, detect planes in the environment, and understand lighting in the camera scene. ARCore 1.0 introduces oriented feature points, which help in the placement of anchors on textured surfaces. These feature points enhance the environmental understanding of the scene. So ARCore is not just limited to horizontal and vertical planes like ARKit, but can create AR apps on any surface. ARCore 1.0 is supported by the Android Emulator in Android Studio 3.1 Beta and is available for use on multiple supported Android devices.

Vuforia integration with Unity

Vuforia allows developers to build cross-platform AR apps directly from the Unity editor. It provides Augmented Reality support for Android, iOS, and UWP devices through a single API. It attaches digital content to different types of objects and environments using Model Targets and Ground Plane, across a broad range of devices and operating systems. Ground Plane attaches digital content to horizontal surfaces. Model Targets provide object recognition capabilities. Other targets include Image (to put AR content on flat objects) and Cloud (to manage large collections of Image Targets from your own CMS). Vuforia also includes a Device Tracking capability which provides an inside-out device tracker for rotational head and hand tracking. It also provides APIs to create immersive experiences that transition between AR and VR.
You can browse through various AR projects from the Unity community to help you get started with your next big AR idea, as well as to choose the toolkit best suited for you.

Leap Motion open sources its $100 augmented reality headset, North Star
Unity and Unreal comparison
Types of Augmented Reality targets
Create Your First Augmented Reality Experience: The Tools and Terms You Need to Understand

A brief history of Blockchain

Packt Editorial Staff
09 Apr 2018
6 min read
History - where do we start?

Blockchain was introduced with the invention of Bitcoin in 2008. Its practical implementation then occurred in 2009. Of course, Blockchain and Bitcoin are very different, but you can't tell the full story behind the history of Blockchain without starting with Bitcoin.

Electronic cash before Blockchain

The concept of electronic cash or digital currency is not new. Since the 1980s, e-cash protocols have existed, based on a model proposed by David Chaum. This is an extract from the new edition of Mastering Blockchain. Just as you need to understand the concept of distributed systems to properly understand Blockchain, you also need to understand electronic cash. This concept pre-dates Blockchain and Bitcoin, but without it, we would certainly not be where we are today.

Two fundamental e-cash system issues need to be addressed: accountability and anonymity. Accountability is required to ensure that cash is spendable only once (the double-spend problem) and that it can only be spent by its rightful owner. The double-spend problem arises when the same money can be spent twice. As it is quite easy to make copies of digital data, this becomes a big issue in digital currencies, since you can make many copies of the same digital cash. Anonymity is required to protect users' privacy. As with physical cash, it is almost impossible to trace spending back to the individual who actually paid the money. David Chaum solved both of these problems during his work in the 1980s by using two cryptographic operations, namely blind signatures and secret sharing. Blind signatures allow for signing a document without actually seeing it, and secret sharing is a concept that enables the detection of double spending, that is, using the same e-cash token twice.

In 2009, the first practical implementation of an electronic cash (e-cash) system named Bitcoin appeared. The term cryptocurrency emerged later. For the very first time, it solved the problem of distributed consensus in a trustless network. It used public key cryptography with a Proof of Work (PoW) mechanism to provide a secure, controlled, and decentralized method of minting digital currency. The key innovation was the idea of an ordered list of blocks composed of transactions and cryptographically secured by the PoW mechanism. Other technologies that acted as precursors to Bitcoin include Merkle trees, hash functions, and hash chains. Looking at all the technologies mentioned earlier and their relevant history, it is easy to see how concepts from electronic cash schemes and distributed systems were combined to create Bitcoin and what is now known as Blockchain. This concept can also be visualized with the help of the following diagram:

Blockchain and Satoshi Nakamoto

In 2008, a groundbreaking paper entitled Bitcoin: A Peer-to-Peer Electronic Cash System was written on the topic of peer-to-peer electronic cash under the pseudonym Satoshi Nakamoto. It introduced the term chain of blocks. No one knows the actual identity of Satoshi Nakamoto. After introducing Bitcoin in 2009, he remained active in the Bitcoin developer community until 2011. He then handed over Bitcoin development to its core developers and simply disappeared. Since then, there has been no communication from him whatsoever, and his existence and identity are shrouded in mystery. The term chain of blocks evolved over the years into the word Blockchain.
Since that point, the history of Blockchain is really the history of its application in different industries. The most notable area is, unsurprisingly, finance. Blockchain has been shown to improve the speed and security of financial transactions. While it hasn't yet become embedded in the mainstream of the financial sector, it surely only remains a matter of time before it begins to take hold.

How it has evolved in recent years

In Blockchain: Blueprint for a New Economy, Melanie Swan identifies three different tiers of Blockchain. These three tiers all showcase how Blockchain is currently evolving. It's worth noting that these various tiers or versions aren't simple chronological points in the history of Blockchain. The lines between each are blurred, and it ultimately depends on how Blockchain technology is being applied which features and capabilities will appear.

Blockchain 1.0: This tier was introduced with the invention of Bitcoin, and it is primarily used for cryptocurrencies. Also, as Bitcoin was the first implementation of cryptocurrencies, it makes sense to categorize this first generation of Blockchain technology to include only cryptographic currencies. All alternative cryptocurrencies, as well as Bitcoin, fall into this category. It includes core applications such as payments. This generation started in 2009, when Bitcoin was released, and ended in early 2010.

Blockchain 2.0: This second Blockchain generation is used by financial services and smart contracts. This tier includes various financial assets, such as derivatives, options, swaps, and bonds. Applications that go beyond currency, finance, and markets are incorporated at this tier. Ethereum, Hyperledger, and other newer Blockchain platforms are considered part of Blockchain 2.0. This generation started when ideas related to using blockchain for other purposes began to emerge in 2010.

Blockchain 3.0: This third Blockchain generation is used to implement applications beyond the financial services industry and is used in government, health, media, the arts, and justice. Again, as in Blockchain 2.0, Ethereum, Hyperledger, and newer blockchains with the ability to code smart contracts are considered part of this blockchain technology tier. This generation emerged around 2012, when multiple applications of Blockchain technology in different industries were researched.

Blockchain X.0: This generation represents a vision of Blockchain singularity where one day there will be a public Blockchain service available that anyone can use, just like the Google search engine. It will provide services for all realms of society. It will be a public and open distributed ledger with general-purpose rational agents (Machina economicus) running on a Blockchain, making decisions and interacting with other intelligent autonomous agents on behalf of people, regulated by code instead of law or paper contracts. This does not mean that law and contracts will disappear; instead, law and contracts will be implementable in code.

Like any history, this history of Blockchain isn't exhaustive. But it does hopefully give you an idea of how it has developed to where we are today. Check out this tutorial to write your first Blockchain program.
Types of Augmented Reality targets

Aarthi Kumaraswamy
08 Apr 2018
6 min read
The essence of Augmented Reality is that your device recognizes objects in the real world and renders the computer graphics registered to the same 3D space, providing the illusion that the virtual objects are in the same physical space as you. Since augmented reality was first invented decades ago, the types of targets the software can recognize have progressed from very simple markers for images and natural feature tracking to full spatial map meshes. There are many AR development toolkits available; some of them are more capable than others of supporting a range of targets. The following is a survey of various Augmented Reality target types. We will go into more detail in later chapters, as we use different targets in different projects.

Marker

The most basic target is a simple marker with a wide border. The advantage of marker targets is that they're readily recognized by the software with very little processing overhead, which minimizes the risk of the app not working, for example due to inconsistent ambient lighting or other environmental conditions. The following is the Hiro marker used in example projects in ARToolkit. (A short marker-detection code sketch appears at the end of this article.)

Coded Markers

Taking simple markers to the next level, areas within the border can be reserved for 2D barcode patterns. This way, a single family of markers can be reused to pop up many different virtual objects by changing the encoded pattern. For example, a children's book may have an AR pop-up on each page, using the same marker shape, but the barcode directs the app to show only the objects relevant to that page in the book. The following is a set of very simple coded markers from ARToolkit:

Vuforia includes a powerful marker system called VuMark that makes it very easy to create branded markers, as illustrated in the following image. As you can see, while the marker styles vary for specific marketing purposes, they share common characteristics, including a reserved area within an outer border for the 2D code:

Images

The ability to recognize and track arbitrary images is a tremendous boost to AR applications, as it avoids the requirement of creating and distributing custom markers paired with specific apps. Image tracking falls into the category of natural feature tracking (NFT). There are characteristics that make a good target image, including having a well-defined border (preferably eight percent of the image width), irregular asymmetrical patterns, and good contrast. When an image is incorporated into your AR app, it's first analyzed and a feature map (2D node mesh) is stored and used to match real-world image captures, say, in frames of video from your phone.

Multi-targets

It is worth noting that apps may be set up to see not just one marker in view but multiple markers. With multi-targets, you can have virtual objects pop up for each marker in the scene simultaneously. Similarly, markers can be printed and folded or pasted on geometric objects, such as product labels or toys. The following is an example cereal box target:

Text recognition

If a marker can include a 2D barcode, then why not just read text? Some AR SDKs allow you to configure your app (train it) to read text in specified fonts. Vuforia goes further with a word list library and the ability to add your own words.

Simple shapes

Your AR app can be configured to recognize basic shapes such as a cuboid or cylinder with specific relative dimensions. It's not just the shape but its measurements that may distinguish one target from another: a Rubik's Cube versus a shoe box, for example.
A cuboid may have width, height, and length. A cylinder may have a length and different top and bottom diameters (for example, a cone). In Vuforia's implementation of basic shapes, the texture patterns on the shaped object are not considered; anything with a similar shape will match. But when you point your app at a real-world object with that shape, it should have enough textured surface for good edge detection; a solid white cube would not be easily recognized.

Object recognition

The ability to recognize and track complex 3D objects is similar but goes beyond 2D image recognition. While planar images are appropriate for flat surfaces, books or simple product packaging, you may need object recognition for toys or consumer products without their packaging. Vuforia, for example, offers the Vuforia Object Scanner to create object data files that can be used in your app as targets. The following is an example of a toy car being scanned by Vuforia Object Scanner:

Spatial maps

Earlier, we introduced spatial maps and dynamic spatial location via SLAM. SDKs that support spatial maps may implement their own solutions and/or expose access to a device's own support. For example, the HoloLens SDK Unity package supports its native spatial maps, of course. Vuforia's spatial mapping (called Smart Terrain) does not use depth sensing like HoloLens; rather, it uses a visible-light camera to construct the environment mesh using photogrammetry. Apple ARKit and Google ARCore also map your environment using the camera video fused with other sensor data.

Geolocation

A bit of an outlier, but worth mentioning: AR apps can also use just the device's GPS sensor to identify its location in the environment and use that information to annotate what is in view. I use the word annotate because GPS tracking is not as accurate as any of the techniques we have mentioned, so it wouldn't work for close-up views of objects. But it can work just fine, say, standing atop a mountain and holding your phone up to see the names of other peaks within the view, or walking down a street to look up Yelp! reviews of restaurants within range. You can even use it for locating and capturing Pokémon.

You read an excerpt from the book Augmented Reality for Developers, by Jonathan Linowes and Krystian Babilinski. To learn how to use these targets and to build a variety of AR apps, check out the book now!
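To make the marker idea above concrete, here is a minimal sketch of fiducial-marker detection using OpenCV's ArUco module in Python. This is not one of the toolkits discussed (ARToolkit or Vuforia); it simply illustrates how cheaply a bordered, coded marker can be recognized. The image path is a placeholder, an opencv-contrib build is assumed, and the detector API differs slightly between OpenCV 4.x releases.

```python
import cv2

# Load a frame that may contain printed ArUco markers (placeholder file name)
frame = cv2.imread("scene.jpg")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# A small dictionary of 4x4 coded markers; the wide border keeps detection cheap
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

# corners: image coordinates of each marker border; ids: the encoded pattern IDs
corners, ids, rejected = detector.detectMarkers(gray)

if ids is not None:
    for marker_id, marker_corners in zip(ids.flatten(), corners):
        print("Marker", int(marker_id), "at", marker_corners.reshape(-1, 2).tolist())
    # With camera calibration data, these corners are what anchor virtual content
    cv2.aruco.drawDetectedMarkers(frame, corners, ids)
    cv2.imwrite("scene_markers.jpg", frame)
```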

Why Oracle is losing the Database Race

Aaron Lazar
06 Apr 2018
3 min read
When you think of databases, the first thing that comes to mind is Oracle or IBM. Oracle has been ruling the database world for decades now, and it has been able to acquire tonnes of applications that use its databases. However, that's changing now, and if you didn't know already, you might be surprised to learn that Oracle is losing the database race.

Oracle = Goliath

Oracle was and still is ranked number one among databases, owing to its legacy in the database ballpark. (Source: DB-Engines) The main reason why Oracle has managed to hold its position is lock-in, a CIO's worst nightmare. Migrating data that's accumulated over the years is not a walk in the park and usually has top management flinching every time it's mentioned. Another reason is that Oracle is known to be aggressive when it comes to maintaining and enforcing licensing terms. You won't be surprised to find Oracle 'agents' at the doorstep of your organisation, slapping you with a big fine for non-compliance!

Oracle != Goliath for everyone

You might wonder whether even the biggies are in the same position, locked in with Oracle. Well, the Amazons and Salesforces of the world have quietly moved away from lock-in hell and have their applications now running on open-source projects. In fact, Salesforce plans to be completely free of Oracle databases by 2023 and has even codenamed this project "Sayonara". I wonder what inspired the name!

Enter the "Davids" of Databases

While Oracle's databases have been declining, alternatives like SQL Server and PostgreSQL have been steadily growing. SQL Server has been doing it in leaps and bounds, with a growth rate of over 30%. Amazon and Microsoft's cloud-based databases have seen close to 10x growth. While one might think that all cloud solutions would have dominated the database world, databases like Google Cloud SQL and IBM Cognos have been suffering very slow to no growth, as the question of lock-in arises again, only this time with a cloud vendor. MongoDB has been another shining star in the database race. Several large organisations like HSBC, Adobe, eBay, Forbes and MTV have adopted MongoDB as their database solution. Newer organisations have been adopting these databases instead of looking to Oracle. However, it's not really eating into Oracle's existing market, at least not yet.

Is 18c Oracle's silver bullet?

Oracle bragged a lot about 18c last year, positioning it as a database that needs little to no human interference thanks to its ground-breaking machine learning, one that promises less than 30 minutes of downtime a year, and many more features. Does this make Microsoft and Amazon break into a sweat? Hell no! Although Oracle has strategically positioned 18c as a database that lowers operational cost by cutting down on the human element, it is still quite expensive when compared to its competitors - they haven't dropped their price one bit. Moreover, it can't really automate "everything" and there's always a need for a human administrator - not really convincing enough. Quite naturally, customers will be drawn towards the competition.

In the end, the way I look at it, Oracle already had a head start and is now inches from the elusive finish line, probably sniggering away at all the customers that it has on a leash. All while cloud databases are slowly catching up and will soon be leaving Oracle in a heap of dirt. Reminds me of that fable mum used to read to me...what's it called...The hare and the tortoise.

Top 10 Tools for Computer Vision

Aaron Lazar
05 Apr 2018
7 min read
The adoption of Computer Vision has been steadily picking up pace over the past decade, but there's been a spike in the adoption of various computer vision tools in recent times, thanks to its implementation in fields like IoT, manufacturing, healthcare, security, etc. Computer vision tools have evolved over the years, so much so that computer vision is now also being offered as a service. Moreover, advancements in hardware like GPUs, as well as machine learning tools and frameworks, make computer vision much more powerful in the present day. Major cloud service providers like Google, Microsoft and AWS have all joined the race towards being the developers' choice. But which tool should you choose? Today I'll take you through a list of the top tools and will help you understand which one to pick, based on your need.

Computer Vision Tools/Libraries

OpenCV: Any post on computer vision is incomplete without a mention of OpenCV. OpenCV is a high-performing computer vision library and it works well with C++ as well as Python. OpenCV comes prebuilt with all the necessary techniques and algorithms to perform several image and video processing tasks. It's quite easy to use, and this makes it clearly the most popular computer vision library on the planet! It is multi-platform, allowing you to build applications for Linux, Windows and Android. At the same time, it does have some drawbacks. It gets a bit slow when working through massive data sets or very large images. Moreover, on its own, it doesn't have GPU support and relies on CUDA for GPU processing. (A short OpenCV example appears near the end of this article.)

Matlab: Matlab is a great tool for creating image processing applications and is widely used in research, the reason being that Matlab allows quick prototyping. Another interesting aspect is that Matlab code is quite concise compared to C++, making it easier to read and debug. It tackles errors before execution by proposing ways to make the code faster. On the downside, Matlab is a paid tool. Also, it can get quite slow at execution time, if that's something that concerns you much. Matlab is not your go-to tool in an actual production environment, as it was basically built for prototyping and research.

AForge.NET/Accord.NET: You'll be excited to know that image processing is possible even if you're a C# and .NET developer, thanks to AForge/Accord. It's a great tool that has a lot of filters and is great for image manipulation and different transforms. The Image Processing Lab allows for filtering capabilities like edge detection and more. AForge is extremely simple to use, as all you need to do is adjust parameters from a user interface. Moreover, its processing speeds are quite good. However, AForge doesn't possess the power and capabilities of other tools like OpenCV, such as advanced motion picture analysis or even advanced processing on images.

TensorFlow: TensorFlow has been gaining popularity over the past couple of years, owing to its power and ease of use. It lets you bring the power of Deep Learning to computer vision and has some great tools to perform image processing and classification through its graph-based tensor API. Moreover, you can make use of the Python API to perform face and expression detection. You can also perform classification using techniques like regression. TensorFlow also allows you to perform computer vision at tremendous scale. One of the main drawbacks of TensorFlow is that it's extremely resource hungry and can devour a GPU's capabilities in no time, quite uncalled for. Moreover, if you wanted to learn how to perform image processing with TensorFlow, you'd have to understand what Machine and Deep Learning is, write your own algorithms and then go forward from there.

CUDA: CUDA is a platform for parallel computing, invented by NVIDIA. It enables great boosts in computing performance by leveraging the power of GPUs. The CUDA Toolkit includes the NVIDIA Performance Primitives library, which is a collection of signal, image, and video processing functions. If you have large, GPU-intensive images to process, you can choose to use CUDA. CUDA is easy to program and is quite efficient and fast. On the downside, it is extremely high on power consumption and you will find yourself reformulating for memory distribution in parallel tasks.

SimpleCV: SimpleCV is a framework for building computer vision applications. It gives you access to a multitude of computer vision tools on the likes of OpenCV, pygame, etc. If you don't want to get into the depths of image processing and just want to get your work done, this is the tool to get your hands on. If you want to do some quick prototyping, SimpleCV will serve you best. However, if your intention is to use it in heavy production environments, you cannot expect it to perform on the level of OpenCV. Moreover, the community forum is not very active and you might find yourself running into walls, especially with the installation.

GPUImage: GPUImage is a framework, or rather an iOS library, that allows you to apply GPU-accelerated effects and filters to images, live motion video, and movies. It is built on OpenGL ES 2.0. Running custom filters on a GPU calls for a lot of code to set up and maintain. GPUImage cuts down on all of that boilerplate and gets the job done for you.

Computer Vision as a Service

Google Cloud and Mobile Vision APIs: Google Cloud Vision API enables developers to perform image processing by encapsulating powerful machine learning models in a simple REST API that can be called from an application. Also, its Optical Character Recognition (OCR) functionality enables you to detect text in your images. The Mobile Vision API lets you detect objects in photos and video, using real-time on-device vision technology. It also lets you scan and recognise barcodes and text.

Amazon Rekognition: Amazon Rekognition is a deep learning-based image and video analysis service that makes adding image and video analysis to your applications a piece of cake. The service can identify objects, text, people, scenes and activities, and it can also detect inappropriate content, apart from providing highly accurate facial analysis and facial recognition for sentiment analysis.

Microsoft Azure Computer Vision API: Microsoft's API is quite similar to its peers and allows you to analyse images, read text in them, and analyse video in near-real time. You can also flag adult content, generate thumbnails of images and recognise handwriting.

Bonus: SciPy and NumPy: I thought I'd add these in as well, since I've seen quite a few developers use Python to build computer vision applications (without OpenCV, that is). SciPy and NumPy are powerful enough to perform image processing. scikit-image is a Python package dedicated to image processing, which uses native NumPy and SciPy arrays as image objects. Moreover, you get to use the cool IPython interactive computing environment, and you can also choose to include OpenCV if you want to do some more hardcore image processing.
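As a small, concrete illustration of the kind of task these libraries handle, here is a minimal edge-detection sketch using OpenCV's Python bindings; the file name is a placeholder and the Canny thresholds are arbitrary.

```python
import cv2

# Load an image from disk (placeholder file name) and convert it to grayscale
image = cv2.imread("input.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Blur slightly to suppress noise, then run the Canny edge detector
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)

# Write the edge map next to the original
cv2.imwrite("edges.jpg", edges)
```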
Well, there you have it: those were the top tools for computer vision and image processing. Head on over and check out these resources to get working with some of the top tools used in the industry.


Top 5 programming languages for crunching Big Data effectively

Amey Varangaonkar
04 Apr 2018
8 min read
One of the most important decisions that Big Data professionals have to make, especially those who are new to the scene or just starting out, is choosing the best programming language for big data manipulation and analysis. Understanding the Big Data problem and framing the architecture to solve it is not quite enough these days - the execution needs to be spot on as well, and choosing the right language goes a long way.

The best languages for big data

In this article, we look at 5 of the most popular - not to mention highly effective - programming languages for developing Big Data solutions.

Scala

A beautiful crossover of the object-oriented and functional programming paradigms, Scala is fast and robust, and a popular choice of language for many Big Data professionals. The fact that two of the most popular Big Data processing frameworks, Apache Spark and Apache Kafka, are built on top of Scala tells you everything you need to know about its power. Scala runs on the JVM, which means code written in Scala can be easily used within a Java-based Big Data ecosystem. One significant factor that differentiates Scala from Java, though, is that Scala is a lot less verbose: logic that takes hundreds of lines of confusing-looking Java code can often be expressed in fewer than 15 lines of Scala. One negative aspect of Scala, though, is its steep learning curve compared to languages like Go and Python, which may put off beginners looking to use it.

Why use Scala for big data?
- Fast and robust
- Suitable for working with Big Data tools like Apache Spark for distributed Big Data processing
- JVM compliant, so it can be used within a Java-based ecosystem

Python

Python has been declared one of the fastest-growing programming languages of 2018, as per the recently held Stack Overflow Developer Survey. Its general-purpose nature means it can be used across a broad spectrum of use cases, and Big Data programming is one major area of application. Many libraries for data analysis and manipulation that are increasingly used within Big Data frameworks to clean and manipulate large chunks of data - such as pandas, NumPy and SciPy - are Python-based. The most popular machine learning and deep learning frameworks, such as scikit-learn and TensorFlow, are also written in Python and are finding increasing application within the Big Data ecosystem. One drawback of using Python, and a reason why it is not yet a first-class citizen for Big Data programming, is that it's slow: although very easy to use, Big Data professionals have found systems built with languages such as Java or Scala faster and more robust than systems built with Python. However, Python makes up for this limitation with other qualities. As Python is primarily a scripting language, interactive coding and development of analytical solutions for Big Data becomes very easy. Python integrates effortlessly with existing Big Data frameworks such as Apache Hadoop and Apache Spark, allowing you to perform predictive analytics at scale without any problem (see the short PySpark sketch at the end of this article).

Why use Python for big data?
- General-purpose
- Rich libraries for data analysis and machine learning
- Easy to use
- Supports iterative development
- Rich integration with Big Data tools
- Interactive computing through Jupyter notebooks

R

It won't come as a surprise to many that those who love statistics love R.
The 'language of statistics', as it is popularly called, R is used to build data models for effective and accurate data analysis. Powered by a large repository of R packages (CRAN, the Comprehensive R Archive Network), R gives you just about every type of tool needed for Big Data processing - from analysis right through to data visualization. R can be integrated seamlessly with Apache Hadoop and Apache Spark, among other popular frameworks, for Big Data processing and analytics. One issue with using R as a programming language for Big Data is that it is not very general-purpose: code written in R is not production-deployable and generally has to be translated into some other programming language such as Python or Java. That said, if your goal is only to build statistical models for Big Data analytics, R is an option you should definitely consider.

Why use R for big data?
- Built for data science
- Support for Hadoop and Spark
- Strong statistical modeling and visualization capabilities
- Support for Jupyter notebooks

Java

Then there's always good old Java. Some of the traditional Big Data frameworks, such as Apache Hadoop and the tools within its ecosystem, are Java-based and still in use today in many enterprises. Not to mention the fact that Java is the most stable and production-ready language among all the languages discussed so far. Using Java to develop your Big Data applications gives you access to a large ecosystem of tools and libraries for interoperability, monitoring and much more, most of which have already been tried and tested. One major drawback of Java is its verbosity: having to write hundreds of lines of code for a task that takes barely 15-20 lines in Python or Scala can put off many budding programmers, although the introduction of lambda expressions in Java 8 does make life quite a bit easier. Java also does not support iterative development the way newer languages like Python do, and this is an area of focus for future Java releases. Despite these flaws, Java remains a strong contender as a preferred language for Big Data programming because of its history and the continued reliance on traditional Big Data tools and frameworks.

Why use Java for big data?
- Traditional Big Data tools and frameworks are written in Java
- Stable and production-ready
- Large ecosystem of tried and tested tools and libraries

Go

Last but not least, there's Go - one of the fastest-rising programming languages in recent times. Designed by a group of Google engineers who were frustrated with C++, Go deserves a place on this list simply because it powers so many tools used in Big Data infrastructure, including Kubernetes and Docker. Go is fast, easy to learn, and fairly easy to develop and deploy applications with. More importantly, as businesses look at building data analysis systems that operate at scale, Go-based systems are being used to integrate machine learning and parallel processing of data, and it is relatively easy to interface other languages with Go-based systems.

Why use Go for big data?
- Fast, easy to use
- Many tools used in the Big Data infrastructure are Go-based
- Efficient distributed computing

There are a few other languages you might want to consider - Julia, SAS and MATLAB being some major ones that are useful in their own right.
However, when compared to the languages we talked about above, we thought they fell a bit short in some aspects - be it speed, efficiency, ease of use, documentation, or community support, among other things.

Let's take a quick look at a comparison of all the languages we discussed above. Note that a ✓ marks the best possible language(s) for each criterion, to help you make an informed decision. This is just our view, and that's not to say that the other languages are any worse!

[Comparison table: Scala, Python, R, Java and Go rated against Speed, Ease of use, Quick learning curve, Data analysis capability, General-purpose, Big Data support (all five languages ticked), Interfacing with other languages, and Production-readiness.]

So... which language should you choose? To answer the question in short - it all depends on the use case you want to develop. If your focus is hardcore data analysis involving a lot of statistical computing, R would be your go-to language. On the other hand, if you want to develop streaming applications for your Big Data, Scala is a preferable choice. If you wish to use machine learning to leverage your Big Data and build predictive models, Python will come to your rescue. Lastly, if you plan to build Big Data solutions using just the traditionally available tools, Java is the language for you. You also have the option of combining the power of two languages to get a more efficient and powerful solution: for example, you can train your machine learning model in Python and deploy it on Spark in distributed mode. Ultimately, it all depends on how efficiently your solution can function, and more importantly, how fast and accurate it is. Which language do you prefer for crunching your Big Data? Do let us know!
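To close, here is the short PySpark sketch referred to in the Python section above - a minimal, illustrative word count meant only to show how compact a Python-based Spark job can be (the input path is a placeholder and a working Spark installation is assumed):

```python
from pyspark.sql import SparkSession

# Start (or reuse) a Spark session.
spark = SparkSession.builder.appName('WordCount').getOrCreate()

# Read a text file into an RDD of lines; the path is a placeholder.
lines = spark.sparkContext.textFile('hdfs:///data/input.txt')

counts = (lines.flatMap(lambda line: line.split())  # split lines into words
               .map(lambda word: (word, 1))         # pair each word with 1
               .reduceByKey(lambda a, b: a + b))    # sum the counts per word

for word, count in counts.take(10):                 # print a small sample
    print(word, count)

spark.stop()
```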

Why DeepMind made Sonnet open source

Sugandha Lahoti
03 Apr 2018
3 min read
DeepMind has always open sourced its projects with a bang. Last year, it announced that it was going to open source Sonnet, a library for quickly building neural network modules with TensorFlow. DeepMind, which was acquired by Google in 2014, shifted from Torch to TensorFlow as its framework of choice in early 2016.

Why Sonnet if you have TensorFlow?

Since adopting TensorFlow as its framework, DeepMind has enjoyed TF's flexibility and adaptiveness for building higher-level frameworks. To ease building neural network modules with TensorFlow, it created a framework called Sonnet. Sonnet doesn't replace TensorFlow; it just eases the process of constructing neural networks. Prior to Sonnet, DeepMind developers were forced to become intimately familiar with the underlying TensorFlow graphs in order to correctly architect their applications. With Sonnet, creating neural network components is much easier: you first construct Python objects that represent parts of a neural network, and then separately connect these objects into the TensorFlow computation graph (the short sketch at the end of this piece illustrates the idea).

What makes Sonnet special?

Sonnet uses Modules. Modules encapsulate elements of a neural network, which in turn abstracts low-level aspects of TensorFlow applications. Sonnet enables developers to build their own Modules using a simple programming model. These Modules simplify neural network training and can be used to implement individual neural networks that are then combined into higher-level networks. Developers can also easily extend Sonnet by implementing their own Modules. Using Sonnet, it becomes easier to switch between different models, allowing engineers to conduct experiments freely without worrying about hampering their entire projects.

Why open source Sonnet?

The announcement that Sonnet would be open sourced came on April 7, 2017, and most people appreciated it as a move in the right direction. One focal purpose of open sourcing Sonnet was to let the developer community use it to take their own research forward. According to FossBytes, "DeepMind foresees Sonnet to be used by the community as a research propellant." With this open sourcing, the machine learning community can contribute back more actively by using Sonnet in their own projects. Moreover, as the community becomes accustomed to DeepMind's internal libraries, it will become easier for the DeepMind group to release other machine learning models alongside research papers. Some experienced developers also point out that using TensorFlow and Sonnet together is similar to using TensorFlow and Torch together, with one Reddit comment stating that "DeepMind's trying to turn TensorFlow into Torch". Nevertheless, the open sourcing of Sonnet is seen as part of DeepMind's broader commitment to open source AI research. Also, as Sonnet is adopted by the community, more frameworks like it are likely to emerge that make neural network construction easier with TensorFlow as the underlying runtime, taking a further step towards the democratization of machine learning and its related fields. Sonnet is already available on GitHub and will be regularly updated by the DeepMind team to match the in-house version.
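To make the "construct module objects, then connect them into the graph" idea concrete, here is a minimal, illustrative sketch assuming the Sonnet 1.x-era API alongside TensorFlow 1.x (the layer sizes and names are arbitrary), not a canonical DeepMind example:

```python
import tensorflow as tf
import sonnet as snt  # DeepMind's Sonnet library (1.x-era API assumed)

# Step 1: construct Python objects that each represent part of a network.
hidden_layer = snt.Linear(output_size=128, name='hidden')
output_layer = snt.Linear(output_size=10, name='logits')

# Step 2: connect the modules into the TensorFlow computation graph.
inputs = tf.placeholder(tf.float32, shape=[None, 784])
hidden = tf.nn.relu(hidden_layer(inputs))
logits = output_layer(hidden)

# Variables are created lazily when a module is first connected.
print([v.name for v in tf.trainable_variables()])
```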


Concurrency programming 101: Why do programmers hang by a thread?

Aarthi Kumaraswamy
03 Apr 2018
4 min read
A thread can be defined as an ordered stream of instructions that can be scheduled to run as such by operating systems. These threads typically live within processes, and consist of a program counter, a stack and a set of registers, as well as an identifier. Threads are the smallest unit of execution to which a processor can allocate time.

Threads are able to interact with shared resources, and communication is possible between multiple threads. They are also able to share memory, and to read and write different memory addresses, but therein lies an issue. When two threads share memory and you have no way to guarantee the order of their execution, you can start seeing issues or subtle bugs that give you the wrong values or crash your system altogether. These issues are primarily caused by race conditions, an important topic for another post. The following figure shows how multiple threads can exist on multiple different CPUs.

Types of threads

Within a typical operating system, we typically have two distinct types of threads:
- User-level threads: threads that we can actively create, run and kill for all of our various tasks
- Kernel-level threads: very low-level threads acting on behalf of the operating system

Python works at the user level, and thus everything we cover here will be primarily focused on user-level threads.

What is multithreading?

When people talk about multithreaded processors, they are typically referring to a processor that can run multiple threads simultaneously, which it does by utilizing a single core that can very quickly switch context between multiple threads. This context switching takes place in such a small amount of time that we could be forgiven for thinking that multiple threads are running in parallel when, in fact, they are not.

When trying to understand multithreading, it's best to think of a multithreaded program as an office. In a single-threaded program, there would only be one person working in this office at all times, handling all of the work in a sequential manner. This would become an issue when this solitary worker becomes bogged down with administrative paperwork and is unable to move on to different work. They would be unable to cope and wouldn't be able to deal with new incoming sales, thus costing our metaphorical business money.

With multithreading, our single solitary worker becomes an excellent multi-tasker, able to work on multiple things at different times. They can make progress on some paperwork, and then switch context to a new task when something starts preventing them from doing further work on that paperwork. By being able to switch context when something is blocking them, they are able to do far more work in a shorter period of time, and thus make our business more money. In this example, it's important to note that we are still limited to only one worker or processing core. If we wanted to improve the amount of work the business could get done and complete work in parallel, we would have to employ other workers, or processes as we would call them in Python.
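As a minimal illustration of user-level threads in Python (the function and task names below are arbitrary), this sketch starts a few threads that each simulate a blocking I/O call and then waits for them all to finish:

```python
import threading
import time

def worker(task_id):
    # time.sleep stands in for blocking I/O; the GIL is released while
    # sleeping, so the other threads can make progress in the meantime.
    time.sleep(1)
    print('task %d finished on %s' % (task_id, threading.current_thread().name))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()   # all four tasks run concurrently
for t in threads:
    t.join()    # wait for every thread to complete
```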
Let's see a few advantages of threading:
- Multiple threads are excellent for speeding up blocking, I/O-bound programs
- They are lightweight in terms of memory footprint when compared to processes
- Threads share resources, and thus communication between them is easier

There are some disadvantages too, which are as follows:
- CPython threads are hamstrung by the limitations of the global interpreter lock (GIL), which we'll go into in more depth in the next chapter
- While communication between threads may be easier, you must be very careful not to implement code that is subject to race conditions (a short sketch of this pitfall follows below)
- It's computationally expensive to switch context between multiple threads; by adding multiple threads, you could see a degradation in your program's overall performance

This is an excerpt from the book Learning Concurrency in Python by Elliot Forbes. To learn how to deal with issues such as deadlocks and race conditions that go hand in hand with concurrent programming, be sure to check out the book.
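To make the race-condition pitfall mentioned above concrete, here is a small, illustrative sketch in which two threads increment a shared counter without a lock; because `counter += 1` is not atomic, updates can be lost depending on how the interpreter interleaves the threads:

```python
import threading

counter = 0

def increment(n):
    global counter
    for _ in range(n):
        # Read-modify-write without a lock: two threads can interleave
        # between the read and the write, losing updates.
        counter += 1

threads = [threading.Thread(target=increment, args=(1000000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# On many CPython versions this prints a total lower than 2000000.
print('expected 2000000, got', counter)
```

Wrapping the increment in a threading.Lock, acquired around the `counter += 1` line, is the usual fix.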


Emoji Scavenger Hunt showcases TensorFlow.js

Richard Gall
03 Apr 2018
3 min read
What is Emoji Scavenger Hunt?

Emoji Scavenger Hunt is a game built using neural networks. Developed by Google using TensorFlow.js, a version of the machine learning library designed to run in the browser, the game showcases how machine learning can be brought to web applications. More importantly, TensorFlow.js, which was announced at the end of March at the TensorFlow Developer Summit, looks like it could be a tool that defines the next few years of web development, making machine learning more accessible to JavaScript developers than ever before. Start playing now.

At the moment Emoji Scavenger Hunt is pretty basic, but the central idea is pretty cool. When you open up the web page in your browser and click 'Let's Play', the app asks for access to your camera. The game then starts: you'll see a countdown before your camera opens, and the web application asks you to find an example of an emoji in the real world. If you find yourself easily irritated, you're probably not going to get addicted, as Google seems to have done its best to cultivate an emoji-esque mise en scene. But the game nevertheless highlights not only how neural networks work, but also, in the context of TensorFlow.js, how they might operate in a browser. Of course, one of the reasons Emoji Scavenger Hunt is so basic is that a core part of the game is training the neural network. Presumably, as more people play it, the neural network will improve at 'guessing' which objects in the real world relate to which emoji on your keyboard.

TensorFlow.js will bring machine learning to the browser

What's exciting is how TensorFlow.js might help shape the future of web development. It's going to make it much easier for JavaScript developers to get started with machine learning - on Reddit, a number of users were thankful that they could now use TensorFlow without touching a line of Python code. On the other hand, and perhaps a little less likely, TensorFlow.js might lead to more machine learning developers using JavaScript. If games like Emoji Scavenger Hunt become the norm, engineers and data scientists will have a new way to train algorithms: getting users to do it for them.

TensorFlow.js and deeplearn.js

Eagle-eyed readers who have been watching TensorFlow closely might be thinking - what about deeplearn.js? Fortunately, the TensorFlow team have an answer: TensorFlow.js "is the successor to deeplearn.js, which is now called TensorFlow.js Core."

TensorFlow.js and the future of machine learning

The announcement of TensorFlow.js highlights that Google and the core development team behind TensorFlow have a clear focus on the future. TensorFlow is already the definitive library for machine learning and deep learning; TensorFlow.js will spread its dominance into new domains. Emoji Scavenger Hunt is pointing the way - we're sure to see plenty of machine learning imitators and innovators over the next few years.

Paper in Two minutes: i-RevNet, a deep invertible convolutional network

Sugandha Lahoti
02 Apr 2018
4 min read
The ICLR 2018 accepted paper i-RevNet: Deep Invertible Networks introduces i-RevNet, an invertible convolutional network that does not discard any information about the input while classifying images. The paper is authored by Jörn-Henrik Jacobsen, Arnold W.M. Smeulders, and Edouard Oyallon. The 6th annual ICLR conference is scheduled to take place between April 30 and May 03, 2018.

i-RevNet, a deep invertible convolutional network

What problem is the paper attempting to solve?

A CNN is generally composed of a cascade of linear and nonlinear operators. These operators are very effective at classifying images of all sorts, but they reveal little about the contribution of the internal representation to the classification. The learning process of a CNN works by progressively reducing large amounts of uninformative variability in the images to reveal the essence of the visual class; however, the extent to which information is discarded gets lost somewhere in the intermediate nonlinear processing steps. There is also a widely held belief that discarding information is essential for learning representations that generalize well to unseen data. The authors show that discarding information is not necessary and support this claim with empirical evidence. The paper also sheds light on the variability reduction process by proposing an invertible convolutional network: the i-RevNet does not discard any information about the input while classifying images, and it has a built-in pseudo-inverse, allowing for easy inversion. It uses linear and invertible operators for performing downsampling, instead of non-invertible variants like spatial pooling.

Paper summary

i-RevNet is an invertible deep network that builds upon the recently introduced RevNet, with the non-invertible components of the original RevNet replaced by invertible ones. i-RevNets retain all information about the input signal in any of their intermediate representations up until the last layer, and they achieve the same performance on ImageNet as similar non-invertible RevNet and ResNet architectures.

The paper's block diagram illustrates the strategy implemented by an i-RevNet: an alternation between additions and nonlinear operators while progressively down-sampling the signal (a small sketch of the underlying additive-coupling idea appears at the end of this post). The pair of outputs from the final layer is concatenated through a merging operator. With this architecture, the authors avoid the non-invertible modules of a RevNet (e.g. max-pooling or strides), which are necessary to train RevNets in a reasonable time and are designed to build invariance w.r.t. translation variability. Their method replaces the non-invertible modules with linear, invertible modules Sj that can reduce the spatial resolution while maintaining the layer's size by increasing the number of channels.

Key Takeaways
- This work provides solid empirical evidence that learning invertible representations does not discard any information about the input on large-scale supervised problems.
- i-RevNet, the proposed invertible network, is a class of CNN that is fully invertible and permits exact recovery of the input from its last convolutional layer.
- i-RevNets achieve the same classification accuracy on complex datasets, as illustrated on ILSVRC-2012, when compared to RevNet and ResNet architectures with a similar number of layers.
- The inverse network is obtained for free when training an i-RevNet, requiring only minimal adaptation to recover inputs from the hidden representations.

Reviewer feedback summary

Overall score: 25/30. Average score: 8.3. Reviewers agreed the paper is a strong contribution, despite some comments about the significance of the result - namely, why invertibility should be a "surprising" property for learnability, in the sense that F(x) = {x, phi(x)}, where phi is a standard CNN, already satisfies both properties: it is invertible, and linear measurements of F produce good classification. Having said that, the reviewers agreed that the paper is well written and easy to follow, and considered it a great contribution to the ICLR conference.
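For intuition on why such blocks can be inverted exactly, here is a small NumPy sketch of the additive coupling scheme that RevNet-style blocks (and hence i-RevNets) build on; the functions F and G are arbitrary stand-ins for learned residual blocks, and this simplification leaves out i-RevNet's invertible down-sampling operators:

```python
import numpy as np

# Arbitrary stand-ins for the learned residual functions.
def F(x):
    return np.tanh(1.5 * x + 0.3)

def G(x):
    return np.tanh(0.7 * x - 0.1)

def forward(x1, x2):
    # Additive coupling: each half is updated using only the other half.
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def inverse(y1, y2):
    # Exact inversion: subtract the same terms in reverse order.
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = np.random.randn(4), np.random.randn(4)
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
print(np.allclose(x1, r1), np.allclose(x2, r2))  # True True
```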


The key differences between Kubernetes and Docker Swarm

Richard Gall
02 Apr 2018
4 min read
The orchestration war between Kubernetes and Docker Swarm appears to be over. Back in October, Docker announced that its Enterprise Edition could be integrated with Kubernetes. This move was widely seen as the Docker team conceding to Kubernetes' dominance as an orchestration tool. But Docker Swarm nevertheless remains popular; it doesn't look like it's about to fall off the face of the earth. So what is the difference between Kubernetes and Docker Swarm? And why should you choose one over the other?

To start with, it's worth saying that both container orchestration tools have a lot in common. Both let you run a cluster of containers, allowing you to scale your container deployments significantly without cloning yourself to mess about with the Docker CLI (although, as you'll see, you could argue that one is more suited to scalability than the other). Ultimately, you'll need to view the various features and key differences between Docker Swarm and Kubernetes in terms of what you want to achieve. Do you want to get up and running quickly? Are you looking to deploy containers on a huge scale? Here's a brief but useful comparison of Kubernetes and Docker Swarm that should help you decide which container orchestration tool you should be using.

Docker Swarm is easier to use than Kubernetes

One of the main reasons you'd choose Docker Swarm over Kubernetes is that it has a much more straightforward learning curve. As popular as it is, Kubernetes is regarded by many developers as complex, and many people complain that it is difficult to configure. Docker Swarm, meanwhile, is actually pretty simple. It's much more accessible for less experienced programmers. And if you need a container orchestration solution now, simplicity is likely to be an important factor in your decision making.

...But Docker Swarm isn't as customizable

Although ease of use is definitely one thing Docker Swarm has over Kubernetes, it also means there's less you can actually do with it. Yes, it gets you up and running, but if you want to do something a little different, you can't. You can configure Kubernetes in a much more tailored way than Docker Swarm. That means that while the learning curve is steeper, the possibilities and opportunities open to you will be far greater.

Kubernetes gives you auto-scaling - Docker Swarm doesn't

When it comes to scalability, it's a close race. Both tools are able to run around 30,000 containers on 1,000 nodes, which is impressive. However, when it comes to auto-scaling, Kubernetes wins, because Docker Swarm doesn't offer that functionality out of the box.

Monitoring container deployments is easier with Kubernetes

This is where Kubernetes has the edge: it has built-in monitoring and logging solutions. With Docker Swarm you'll have to use third-party applications. That isn't necessarily a huge problem, but it does make life ever so slightly more difficult. Whether that extra difficulty is enough to outweigh Kubernetes' steeper learning curve, however, is another matter...

Is Kubernetes or Docker Swarm better?

Clearly, Kubernetes is a more advanced tool than Docker Swarm. That's one of the reasons why the Docker team backed down and opened up their enterprise tool for integration with Kubernetes. Kubernetes is simply the software that's defining container orchestration. And that's fine - Docker has cemented its position within the stack of technologies that support software automation and deployment.
It's time to let someone else take on the challenge of orchestration

Although Kubernetes is the more 'advanced' tool, that doesn't mean you should overlook Docker Swarm. If you want to begin deploying container clusters without the need for specific configurations, then don't allow yourself to be seduced by something shinier, something ostensibly more popular. As with everything else in software development, understand and define the job that needs to be done - then choose the right tool for it.