Apple avoids AI hype at WWDC keynote by quietly integrating ML into its products

Someone scans their face using Apple’s “most advanced machine learning techniques” with Apple Vision Pro during a demo reel at WWDC 2023.

Apple

Among the notable new products revealed at Monday’s WWDC 2023 keynote, such as the Apple Silicon Mac Pro and the Apple Vision Pro, Apple presenters never once said the term “AI,” a notable omission given that competitors like Microsoft and Google are currently focusing heavily on generative AI. Still, AI was part of Apple’s presentation, just under other names.

While “AI” is a very ambiguous term these days, surrounded by both incredible advances and extreme hype, Apple has chosen to avoid that association, instead favoring terms like “machine learning” and “ML.” For example, during the iOS 17 demo, SVP of Software Engineering Craig Federighi talked about improvements to autocorrect and dictation:

Autocorrect relies on on-device machine learning, and we’ve continued to improve these models over the years. The keyboard now leverages a transformer language model, which is state-of-the-art for word prediction, making autocorrect more accurate than ever. And with the power of Apple Silicon, iPhone can run this model every time you tap a key.

Notably, Apple mentioned the AI term “transformer” during the keynote. The company specifically spoke of a “transformer language model,” meaning that its AI model uses the transformer architecture that has powered many recent generative AI innovations, such as the DALL-E image generator and the ChatGPT chatbot.

A transformer model (a concept first introduced in 2017) is a type of neural network architecture used in natural language processing (NLP) that employs a self-attention mechanism, which allows it to prioritize different words or elements in a sequence. Its ability to process inputs in parallel has led to significant efficiency improvements and has enabled progress in NLP tasks such as translating, summarizing, and answering questions.
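To make “self-attention” a bit more concrete, here is a minimal, unofficial Python sketch of scaled dot-product attention, the core operation inside a transformer. The toy dimensions, random inputs, and function names are our own illustration, not anything from Apple’s models:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each position in the sequence weighs every other position."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # pairwise similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over positions
    return weights @ V                                 # weighted mix of value vectors

# Toy example: a 4-token sequence with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
# In a real transformer, Q, K, and V come from learned linear projections of x.
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8) -- one updated vector per token, all computed at once
```

Because every token’s output falls out of a few matrix multiplications rather than a word-by-word loop, transformers parallelize far better than the recurrent models that preceded them, which is the efficiency gain mentioned above.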

Apparently, Apple’s new transformer model in iOS 17 allows for sentence-level autocorrections that can complete a word or an entire sentence when you hit the space bar. It also learns from your writing style, which informs its suggestions.

All of this on-device AI processing is feasible for Apple thanks to a dedicated portion of Apple Silicon chips (and earlier Apple chips, starting with the A11 in 2017) called the Neural Engine, which is designed to accelerate machine learning applications. Apple also said that Dictation “gets a new transformer-based speech recognition model that leverages the Neural Engine to make dictation even more accurate.”
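Apple didn’t detail how models reach the Neural Engine, but in practice developers typically package them for Core ML. Here is a hypothetical Python sketch, assuming the coremltools package and a made-up toy model, that converts a small PyTorch network and lets Core ML schedule it on the CPU, GPU, or Neural Engine:

```python
# Hypothetical sketch: the model, shapes, and file name are placeholders.
import torch
import coremltools as ct

model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
).eval()

example_input = torch.rand(1, 128)
traced = torch.jit.trace(model, example_input)      # TorchScript graph for conversion

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=example_input.shape)],
    convert_to="mlprogram",
    # Let Core ML pick the best available hardware, including the Neural Engine.
    compute_units=ct.ComputeUnit.ALL,
)
mlmodel.save("TinyClassifier.mlpackage")
```

Whether a given layer actually lands on the Neural Engine is up to Core ML’s scheduler, which is part of how Apple can fold ML features into the OS without users ever thinking about the hardware underneath.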

A screenshot of Craig Federighi talking about autocorrect in iOS 17, which now uses a “transformer language model.”

Apple

During the keynote, Apple also mentioned “machine learning” several other times: while describing a new iPad lock screen feature (“When you select a Live Photo, we use an advanced machine learning model to synthesize additional frames”); iPadOS PDF functionality (“Thanks to new machine learning models, iPadOS can identify fields in a PDF so you can use AutoFill to quickly fill them with information like names, addresses, and emails from your contacts.”); an AirPods Adaptive Audio feature (“With personalized volume, we use machine learning to understand your listening preferences over time”); and an Apple Watch widget feature called Smart Stack (“Smart Stack uses machine learning to show you relevant information just when you need it”).

Apple also introduced a new app called Journal that allows personal text and image journaling (a bit like an interactive diary), locked and encrypted on your iPhone. Apple said that AI plays a role, but it did not use the term “AI.”

“Using on-device machine learning, your iPhone can create personalized suggestions of moments to inspire your writing,” Apple said. “The suggestions will be intelligently curated from information on your iPhone, like your photos, location, music, workouts and more. And you control what to include when you enable the suggestions and which ones to save in your journal.”

Finally, during the demo of the new Apple Vision Pro, the company revealed that the moving image of a user’s eyes on the front of the headset comes from a special 3D avatar created by scanning your face, and, you guessed it, machine learning.

“Using our most advanced machine learning techniques, we’ve created a new solution,” Apple said. “After a quick enrollment process using the front sensors on the Vision Pro, the system uses an advanced encoder-decoder neural network to create your digital Persona.”

The user’s eye display on Apple’s Vision Pro headset is created through face scanning and “advanced machine learning techniques.”

An encoder-decoder neural network is a type of neural network that first encodes an input into a compact numerical form called a “latent space representation” (the encoder), then reconstructs data from that representation (the decoder). We’re speculating, but the encoder portion could parse and compress the facial data captured during the scanning process into a smaller, more manageable latent representation. The decoder portion could then use that condensed information to generate the final 3D model of the face.
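For illustration only, and emphatically not Apple’s actual Persona pipeline, here is a tiny Python sketch of the encoder-decoder idea: an “encoder” squeezes a high-dimensional input into a small latent vector, and a “decoder” expands that vector back out. The weights here are random and untrained, and the dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random, untrained weights -- purely illustrative stand-ins for a learned model.
input_dim, latent_dim = 512, 32
W_enc = rng.normal(scale=0.05, size=(input_dim, latent_dim))
W_dec = rng.normal(scale=0.05, size=(latent_dim, input_dim))

def encoder(x):
    """Compress the input into a small latent-space representation."""
    return np.tanh(x @ W_enc)                 # shape: (latent_dim,)

def decoder(z):
    """Reconstruct (or generate) output from the latent representation."""
    return z @ W_dec                          # shape: (input_dim,)

face_scan = rng.normal(size=(input_dim,))     # stand-in for captured facial data
latent = encoder(face_scan)                   # 512 numbers squeezed down to 32
reconstruction = decoder(latent)              # decoder expands the latent back out
print(latent.shape, reconstruction.shape)     # (32,) (512,)
```

In a trained system, both halves would be learned together so that the decoder’s output (or, in Apple’s case, the rendered Persona) matches the person as closely as possible.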

The AI power of the M2 Ultra?

The impressive specs of the M2 Ultra, according to Apple.

Apple

During the WWDC keynote, Apple unveiled its most powerful Apple Silicon chip yet, the M2 Ultra, which features a 24-core CPU, up to a 76-core GPU, and a 32-core Neural Engine that Apple says delivers 31.6 trillion operations per second, 40 percent faster performance than the M1 Ultra.

Interestingly, Apple directly stated that this power could come in handy for training “large transformer models,” which to our knowledge is the most prominent nod to AI yet in an Apple keynote (albeit only in passing):

And the M2 Ultra can support a whopping 192GB of unified memory, which is 50% more than the M1 Ultra, allowing it to do things other chips can’t. For example, in a single system, it can train huge ML workloads, such as large transformer models that the most powerful discrete GPU can’t even process because it runs out of memory.

This development has some AI experts excited. On Twitter, frequent AI commentator Perry E. Metzger wrote, “Whether by accident or intentionally, Apple Silicon’s unified memory architecture means that high-end Macs are now truly amazing machines for running large AI models and AI research. There really aren’t many other systems at this price point that offer 192GB of GPU-accessible RAM.”

Here, more RAM means that larger, and presumably more capable, AI models can fit into memory. The systems in question are the new Mac Studio (starting at $1,999) and the new Mac Pro (starting at $6,999), which could potentially put AI model training within reach of many new people, and in desktop and tower form factors.
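As a rough back-of-the-envelope illustration (our arithmetic, not Apple’s), here’s how memory capacity bounds model size when weights are stored at 16-bit precision; actual training needs considerably more room for gradients, optimizer state, and activations:

```python
# Rough upper bound: how many 16-bit parameters fit in a given amount of memory.
BYTES_PER_PARAM_FP16 = 2

def max_params_billions(memory_gb):
    """Billions of fp16 weights that fit in memory_gb gigabytes (weights alone)."""
    return memory_gb * 1e9 / BYTES_PER_PARAM_FP16 / 1e9

for gb in (24, 80, 192):
    print(f"{gb:>3} GB -> ~{max_params_billions(gb):.0f}B parameters")
```

By that crude measure, 192GB of unified memory leaves room for weights that simply would not fit on a typical 24GB consumer GPU, though real-world training overhead shrinks the practical model size considerably.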

Only rigorous benchmarks will tell how these new M2 Ultra machines stack up against AI-optimized Nvidia GPUs like the H100. For now, it appears that Apple has openly thrown its hat into the ring of generative AI training hardware.

