Technology

End-to-end Entity Resolution for Big Data — Introduction to the entity resolution pipeline and the algorithms at the different stages. Includes a summary of open source tools and their features. (via Adrian Colyer) 33 Engineering Challenges of Building Mobile Apps at Scale — Part 1, covering the first...
The last month of the old year showed a lot of activity on the border of AI and biology. The advances in protein folding with deep learning are a huge breakthrough that could revolutionize drug design. It’s important to remember the role AI had in developing the vaccine for...
A few months ago, I said that “making everything into a design pattern is a sign that you don’t know what design patterns really are.” So now, I feel obliged to say something about what design patterns are. Design patterns are frequently observed solutions to common problems. The idea comes...
“[T]he threats to consumers arising from data abuse, including those posed by algorithmic harms, are mounting and urgent.” FTC Commissioner Rebecca K. Slaughter Variants of artificial intelligence (AI), such as predictive modeling, statistical learning, and machine learning (ML), can create new value for organizations. AI can also cause costly reputational damage,...
In this report, we look at the data generated by the O’Reilly online learning platform to discern trends in the technology industry—trends technology leaders need to follow. But what are “trends”? All too often, trends degenerate into horse races over languages and platforms. Look at all the angst heating up...
The unwilling star of this month’s trends is clearly Facebook. Between reports that they knew about the damage that their applications were causing long before that damage hit the news, their continued denials and apologies, and their attempts to block researchers from studying the consequences of their products, they’ve...
Kevlin Henney and I were riffing on some ideas about GitHub Copilot, the tool for automatically generating code based on GPT-3’s language model, trained on the body of code that’s in GitHub. This article poses some questions and (perhaps) some answers, without trying to present any conclusions. First, we wondered...
Much has been written about the struggles of deploying machine learning projects to production. As with many burgeoning fields and disciplines, we don’t yet have a shared canonical infrastructure stack or best practices for developing and deploying data-intensive applications. This is both frustrating for companies that would prefer making ML...
The introduction of data science into the business world has contributed far more than recommendation algorithms; it has also taught us a lot about the efficacy with which we manage our businesses. Specifically, data science has introduced rigorous methods for measuring the outcomes of business ideas. These are the...
While October’s news was dominated by Facebook’s (excuse me, Meta’s) continued problems (you’d think they’d get tired of the apology tour), the most interesting news comes from the AI world. I’m fascinated by the use of large language models to analyze the “speech” of whales, and to preserve endangered...

Latest Posts