Wallstreetcn
2023.11.09 08:41
portai
I'm PortAI, I can summarize articles.

GPT is bigger, smarter, faster, cheaper, and simpler! OpenAI is the "most exciting company"

The coming months and years will be absolutely crazy.

How exciting is the "Tech Spring Festival" in the AI field, the OpenAI Developer Conference?

On November 7th, after the OpenAI Developer Conference, tech blogger Dan Shipper published an article about his observations at the conference. He introduced the major updates proposed by OpenAI and described the astonishing progress of OpenAI, which he believes will be "crazy" in the coming months and years.

Shipper mentioned that the GPT-4 Turbo launched by OpenAI has five major upgrades: larger context capacity, higher intelligence level, faster response speed, lower price, and simpler operation. Not only has the model itself been enhanced, but the interaction with the model has also become simpler and more convenient. The new retrieval function and automatic maintenance of dialogue state make it easier for developers to build applications, and the no-code custom ChatGPT lowers the threshold for ordinary users.

He believes that these features lay the foundation for future updates of OpenAI's agent services. The so-called agent refers to a model that can autonomously plan and execute more complex, multi-step tasks, and complete them without supervision. Although GPT-4 is not yet "smart" enough, OpenAI is already preparing for this goal.

Shipper also analyzed OpenAI's strategy of building an application store. He believes that this strategy allows users to build their own personalized GPT and charge for it, democratizing the ability to build chatbots. However, this strategy also has its problems, such as users possibly feeling exhausted from switching between different versions of ChatGPT - an issue that Shipper believes OpenAI needs to address.

Finally, he mentioned the delicate relationship between OpenAI and developers. He pointed out that many of the recent updates released by OpenAI are more targeted towards consumers rather than developers, even though OpenAI's initial goal was to serve developers. This creates a contradiction, as ChatGPT will directly compete with developers.

Shipper believes that if OpenAI has to choose between ChatGPT and its developer ecosystem, it will choose the former:

ChatGPT is the most valuable source of high-quality training data for OpenAI, so it is the best way to improve model quality.

Shipper added that this is actually a core issue faced by many tech companies. For example, Apple has been criticized for competing with third-party developer products with its internal products, but this problem may be more severe for OpenAI:

It's like Apple allowing developers to release their own versions of iOS.

But overall, Shipper believes that this event by OpenAI is still exciting and demonstrates the astonishing pace of progress of the company. Now there is no company that is more interesting and faster than OpenAI. This company's progress is astonishing, and there is no sign of slowing down in the foreseeable future.

At this conference, the industry consensus is that OpenAI is a powerhouse of talent and gives a strong impression similar to Stripe in its heyday. (In fact, I heard that OpenAI has hired many people who used to work at Stripe.)

The energy in the room is palpable. I believe there is no bigger or more exciting story in the tech industry. The coming months and years will be absolutely crazy.

Below is the original article published by Shipper on his tech news website Every, compiled and translated by Wall Street News:

My Observations at OpenAI Developer Day

I enjoy observing people when they think no one is paying attention.

That's the beauty of attending events like OpenAI Developer Day: you get to see things that the cameras don't capture and hear words that are not spoken on stage.

The venue is packed with people, bustling with activity. The WiFi is lightning fast, and the LED lights are shining brightly. It's a magic show designed for AI enthusiasts like me.

I navigate through the crowd, performing my patented move, the Furtive Conference Ogle (FCO): sneakily peeking at the conference. I might catch a glimpse of someone famous, like Roon, Karpathy, or Kevin Roose, and quickly glance at their badges before looking back at their faces, as if to say, "Hey, my eyes are up here, buddy!"

I usually prefer sitting in the back during events, but for Developer Day, I made sure to secure a seat in the front row. I wanted to get a close-up view of this magic show.

Sam Altman takes the stage and greets the audience. As he performs, I can see the tension, the restraint, and the nervous energy in his face and body. I can sense the hours of practice he has put in. After a brief opening monologue, Sam introduces a video showcasing creative professionals, developers, and ordinary people talking about how they use ChatGPT. The lights dim, and he steps aside as the video begins. Everyone is watching the video, but I'm watching Sam.

He stands alone in the shadows at one corner of the stage. He's wearing dark jeans and a pair of original Adidas x Lego collaboration sneakers. His hands are clasped together, and his gaze is fixed on the floor. Sam is tense, always "on edge." But on the side of the stage, listening to the video playing, he appears relaxed, carefree. It's as if I've caught hold of the magician's left hand, hiding a coin, while the audience is focused on his right hand waving around.

Temporarily seeing through the magician's trick would break their spell. But it also creates a new kind of magic: you see the magician as a human being. Eating, breathing, putting on pants one leg at a time, yet still performing magic tricks. Sam is becoming a legendary figure in the tech industry. But in that moment on stage, he is also just a person. He seems to be enjoying himself, observing and anticipating what he has created, watching it unfold on the world's biggest stage. He has achieved the dreams of everyone who has ever made something and hoped the world would love it.

Witnessing that moment was worth the price of admission. I won't forget it anytime soon.

Here's what he wants to tell us:

Bigger, Smarter, Faster, Cheaper, Simpler.

That's the main change OpenAI announced yesterday. Let's review these updates one by one and discuss why they are so important.

A New Model: GPT-4 Turbo

Bigger

OpenAI has launched a new model, GPT-4 Turbo, which has a context window of 128K tokens. This means that every prompt you send to GPT-4 Turbo can be equivalent to 300 pages of text. The following things are within 300 pages:

  • The entire content of Eric Ries' "The Lean Startup"
  • Three copies of Antoine de Saint-Exupéry's "The Little Prince"
  • At least half of my moody diary from middle school

This is a 16-fold increase in context window length compared to the previous widely used version, GPT-4. It significantly enhances the complexity and functionality of queries that developers can run with GPT-4. Previously, developers had to spend time and effort deciding which information to include in their prompts, which was one of the most important bottlenecks for LLM performance.

The 128K context window greatly simplifies this task, but it doesn't solve all the problems. Overly long context windows are difficult to manage, and the language model will increasingly forget or ignore contextual information. We don't yet know if GPT-4 Turbo has these issues, and I will share with you as I use it.

Smarter

GPT-4 Turbo is smarter than OpenAI's previous models in the following ways:

It can use multiple tools simultaneously. The previous version of GPT-4 introduced tool usage, which I have reported on. Tool usage allows GPT-4 to invoke developer-defined tools, such as web browsing, calculators, or APIs, to complete queries. Previously, GPT-4 could only use one tool at a time. Now it can use multiple tools simultaneously.

Knowledge cutoff date update. The previous version of GPT-4 only knew events up until September 2021. This version is updated to April 2023, making it more reliable.

GPT-4 speaks JSON. JSON is a text format that non-AI applications can easily read. GPT-4 Turbo can reliably return results in this format, making it easier to integrate with other software. Previously, developers had to "trick" it into formatting the output correctly, for example, by telling GPT that it would be fired if the format was wrong. No more need for deception.

GPT-4 can now write and run code. For a while, ChatGPT Plus users have been able to use the code interpreter (later renamed Advanced Data Analysis), a ChatGPT plugin that allows you to write and run Python code. It's like having a data scientist in your pocket - now developers can use and integrate it into their own programs through the GPT-4 API.

It's multimodal. The GPT-4 API can accept images as input: developers can send any image and GPT-4 can tell them what it sees. It can also do text-to-speech, meaning it can respond to text inputs with human voice. It can even generate images using DALL-E.

It's faster.

There are no publicly available speed benchmarks as far as I know, but Sam says it's faster. Based on my scientific testing last night in my pajamas, he's right. It's really fast. It leaves GPT-4 in the dust, looking at least as fast as GPT 3.5 Turbo, if not slightly faster - the previous fastest model.

It's cheaper.

GPT-4 Turbo is three times cheaper than GPT-4. I can't remember any company that can significantly improve performance while lowering prices.

We're lucky that OpenAI plays by Silicon Valley's rules, aiming to create mass applications rather than just high-priced corporate contracts. As long as it's affordable enough, artificial intelligence can be accessible to everyone, and that's exactly what OpenAI aims for.

If IBM had invented GPT, do you think they would do something like this? No.

It's simpler.

OpenAI also makes it easier for developers and non-developers to interact with GPT-4 Turbo. The company eliminates the need for many third-party libraries' functionalities (as well as the template code developers usually write). Here are some ways:

Retrieval. This is a big step forward. One of the most important ways to improve the performance of large language models is to allow them access to private data, such as company knowledge bases or personal notes. Previously, this functionality required manual construction (like what I did for my Huberman Lab chatbot) or the use of third-party libraries like Langchain or LlamaIndex (I'm an investor in the latter). OpenAI has integrated some of the functionalities of these libraries into its core API through its retrieval feature - making it easier for developers to start building GPT-4 applications.

This will lead to interesting results. On the one hand, it reduces the need for these third-party libraries. On the other hand, OpenAI's retrieval mechanism is currently a black box with no configurability. Retrieval is a challenging problem, with many different retrieval mechanisms for different purposes. OpenAI's new release covers the basics, but Langchain and LlamaIndex implement various types of retrieval and are applicable to models not made by OpenAI - so there is still demand for their services. Saving state. As I mentioned before, GPT-4 is like Lucy Whitmore in "50 First Dates": every time you interact with it, you have to introduce yourself and explain why it loves you all over again. With the new "Threads" feature in the GPT-4 API (unrelated to Meta's Twitter clone), it can automatically remember the conversation history, saving developers time and trouble as they no longer need to manage the conversation history themselves.

Custom no-code ChatGPT. OpenAI has also made it easy for anyone to build their own custom version of ChatGPT with built-in private data, without the need for programming. Anyone can set up a ChatGPT version with its own personality and the ability to access private knowledge. This is a significant advancement. Earlier this year, I built a bot based on Lenny Rachitsky's newsletter archive for Substack authors. The latest update means that anyone can build an equivalent bot without coding.

GPT App Store. OpenAI announced that anyone can list their GPT in the public App Store and charge for it. I have been advocating for chatbots as a new form of content for almost a year, and this development supports that argument.

No need to switch models. This is a huge update. In previous versions of ChatGPT, you had to choose which model to use: GPT-3.5, GPT-4, GPT with DALL-E, GPT with Web Browsing, or GPT with Advanced Data Analysis. Now, you just send a message to ChatGPT and it will choose the appropriate model for you. Users can easily combine different features of ChatGPT without switching back and forth, creating new opportunities for developers (which will be covered later in this article).

Incremental updates - laying the foundation for the future

All of these updates are great, but they are mostly incremental. They build upon the tasks that many developers had to do themselves in the API, making what developers build faster, cheaper, and more powerful.

However, these features lay the foundation for a potentially more important update: agents. Agents are models that can be assigned complex, multi-step tasks and complete them without supervision. This is the new Assistant API for GPT-4.

This API supports retrieval, saving state, and tool usage (as mentioned above). These elements combined are the beginning of agent services. From the current situation, it seems that OpenAI is envisioning a world where you can assign a goal to an assistant, give them a set of tools, and let them accomplish the goal on their own. We are still far from that point because GPT-4 is not intelligent enough to plan and execute tasks autonomously. However, OpenAI is currently laying the groundwork and security infrastructure and intends to launch in progressive steps to ensure technological readiness.

OpenAI is trying to create an app store

In April of this year, OpenAI introduced plugins that allow users to access third-party services and data from within ChatGPT. There has been a lot of hype about plugins becoming a new App Store, but that is not the case. OpenAI has never released any relevant data, but from what I know, the adoption rate of third-party plugins is very low, despite the high adoption rate of the two plugins built by OpenAI: the code interpreter and DLL-E.

Now, OpenAI is attempting this again with GPT - its service allows anyone to create a customized version of ChatGPT using private data:

Any user can create their own GPT. You can define its personality: how it responds to inquiries, what voice and tone it uses. You can give it access to tools such as the ability to execute code or obtain answers from a private knowledge base. Then, you can publish the GPT for other users to use.

I installed a new GPT called "Negotiator" (built by OpenAI), which can help you advocate for yourself in any type of negotiation. It appears in my ChatGPT sidebar as follows:

If I click on Negotiator, it takes me out of the regular ChatGPT and into a specially designed experience that helps me achieve the best results in any negotiation:

I really like this approach. I like the idea of democratizing the ability to build chatbots - I can foresee doing a lot of experimentation here in the coming weeks.

However, I still have concerns. It faces the same problem as the failed plugin experiment by OpenAI: no one wants to switch between different versions of ChatGPT for different use cases.

A better approach would be to allow ChatGPT to automatically switch to a specific personality, such as "negotiation expert," when needed, and switch back when not needed. Until this happens, I don't see these bots being widely adopted. But if it happens, it will be huge. Downloading a new personality for ChatGPT would be equivalent to having your AI read a book on a new topic or take a course. In this world, there will be an entire economy of content created by people specifically for LLMs, not humans. For example, I might purchase the equivalent of negotiation books that ChatGPT can read and digest, instead of buying a negotiation book to read myself.

Therefore, I believe that OpenAI does have the potential to eventually establish an app store experience. However, this won't happen until they figure out how to automatically switch ChatGPT between long lists of personalities. Given that OpenAI has made changes to ChatGPT so that you don't have to switch between its internal models, this may also come soon for custom GPTs.

OpenAI's Relationship with Developers

One notable aspect of this developer conference is that many of the updates released by OpenAI are more focused on consumers rather than developers. For example, Custom GPT is consumer-oriented, as are some of the specific updates released for ChatGPT. This reflects an important fact: OpenAI is currently positioned between being a consumer company and a developer company.

ChatGPT was born with original sin. When OpenAI first started, its goal was to serve developers - until it accidentally created the largest consumer application in history. Unfortunately, this put the company at odds with developers because ChatGPT directly competes with many things developers want to build, both at the consumer level and the infrastructure level.

If OpenAI had to choose between ChatGPT and its developer ecosystem, it would have to choose ChatGPT. ChatGPT is the most valuable source of high-quality training data for OpenAI, so it is the best way to improve model quality.

Furthermore, OpenAI is also moving towards commercializing and consumerizing its development work. ChatGPT itself can turn anyone into a semi-competent programmer. The feature it launched yesterday allows anyone to build chatbots without needing to code.

This is a fundamental tension at the core of the company. It is also a tension that exists in many platforms - for example, Apple faces a tension between iOS and MacOS. Apple has been criticized for competing with third-party developer products with its own internal products, which is referred to as "Sherlocking".

But for OpenAI, this is even more of a problem because its consumer products are remarkably similar to the products it offers to developers. It's like Apple allowing developers to release their own versions of iOS.

I guess if you want to play a role in the OpenAI ecosystem, the best way is to collect private datasets that would be useful for someone using ChatGPT and release them as custom GPTs. OpenAI may invest in making GPT more accessible and powerful in the ChatGPT interface over time. The advantage you bring to the party will be private, curated data - along with a set of rules on how to apply this data to specific types of users in specific situations. This is likely not something that OpenAI wants to directly compete with - so it's a win-win situation.

The most exciting company in the world

There is no company doing more interesting and faster work than OpenAI right now. The progress of this company is astonishing, and there are no signs of slowing down in the foreseeable future. The industry consensus at this conference is that OpenAI is a talent powerhouse and feels very much like Stripe in its heyday. (In fact, I heard that OpenAI has hired many people who used to work at Stripe.)

The energy in the room is palpable. I believe there is no bigger or more interesting story in the tech industry. The coming months and years are going to be crazy.

Miscellaneous

Diversity. I appreciate the inclusivity of this conference. As far as I know, the food provided by the company is delicious and comes from local women-owned or minority-owned businesses. The speakers for the demos and group discussions are very diverse, holding leadership positions at OpenAI, Shopify, Salesforce, and other major tech companies. It's all very understated, without any showmanship. In my opinion, OpenAI is doing things right and deserves praise.

OpenAI and Microsoft. One attendee told me that the relationship between OpenAI and Microsoft reminds him of the long-standing partnership between Apple and Intel. The processors are made by Intel, and everything else is done by Apple. In the case of OpenAI, Microsoft provides the hosting infrastructure while OpenAI takes care of everything else. It's not a perfect analogy, but it resonated with me, especially when Satya Nadella appeared at this conference and stood on stage with Sam during his keynote speech.

Can anyone make sense of OpenAI's naming? I can't believe they named their new custom codeless ChatGPTs "GPTs". Someone needs to step in and intervene - it's just too confusing.