Last week, we hosted an Ask Me Anything (AMA) session on generative AI, Pieces Copilot, and what both mean for developer productivity. CEO Tsavo Knott, Lead ML Engineer Brian Lambert, Head of Engineering Mark Widman, and Head of Growth Cole Stark took the virtual stage to answer questions, discuss individual vs enterprise workflows as they relate to generative AI, and share insights on the latest Pieces Copilot updates, the roadmap ahead, and how we compare to other generative AI startups.

Whether you're a seasoned developer, a project manager, or just someone interested in the technological advances impacting the software development landscape, this AMA provided both general and technical information that clears up some of the mystery around generative AI and potential capabilities.

In case you missed out on attending the live event, read through this blog post where we’ll summarize the key takeaways, highlight some of the most thought-provoking questions and answers, and even share video snippets of the discussion to get you up to speed.

Topics Covered:

Introductions (00:27) - Meeting the team, along with a brief outline to set the stage.
Opportunities and Challenges of Generative AI Startups (04:09 - 14:17) - Discussing the hurdles and opportunities in the world of Generative AI, as well as where we are in 2023.
Individual vs Enterprise Workflows (14:18 - 18:27) - How AI can augment developer workflows from the browser, to the IDE, to the collaborative environment.
Demo Session: Exploring the New Features of Pieces OS 6.0.0 (18:28 - 27:33) - A hands-on walkthrough of Pieces Copilot: Storing snippets, setting custom context, using multimodal inputs, and much more.
Local vs Cloud LLMs (27:34 - 31:35) - The pros and cons of local and cloud-based large language models.
Copilot Integrations (31:36 - 49:03) - The many ways Pieces Copilot integrates across your workflow and its awareness of team roles.
Upcoming Features (49:04 - 52:49) - A sneak peek into what's on the horizon, including workspaces, go-to-market plan, and even a glimpse into Pieces for Designers!
Workflow Activity, Data Storage, and Security (52:50 - 01:03:39) - How Pieces handles data storage and sensitive information, and Retrieval Augmented Generation considerations.
AI for Education + Technical Requirements (01:03:40 - 01:15:47) - More questions around how generative AI can facilitate education, plus the nitty-gritty of LLM runtimes and machine requirements for local and cloud AI.
Automatic Context Setting & Re-Grounding (01:15:48 - 01:21:55) - How Pieces Copilot understands your workflow for accurate code generation, and streamlines workflows for anyone working with file fragments.
Prize Giveaways & Closing Remarks (01:21:56 - 01:24:06) - Wrapping up the AMA with some exciting giveaways and final thoughts.

So, sit back, grab a cup of coffee, and let's revisit the highlights of our latest live stream with the developer community.

Opportunities and Challenges of Generative AI Startups

The discussion opened with a candid observation: everything in the development landscape is scaling upwards. More code, more features, more contributors—each facet is growing in complexity and volume.

This high velocity brings with it an equal measure of chaos. Not only developers but also project managers and entire teams are dealing with the rapid pace of development, risking overlooking crucial information in the process. The consensus was that while speed is invaluable, managing, curating, and iterating on it is equally crucial to reap the benefits of boosted productivity, and generative AI startups should focus their development on addressing those concerns.

The Blind Spots in Rapid Development

We highlighted an often-overlooked facet of using generative AI for coding: the potential to miss important edge cases. As the technology can rapidly churn out code, it may not always account for the unique intricacies of a given system, possibly leading to unforeseen errors or system failures. The concern here is not just about the code that is generated, but also about the developer's understanding of that code.

Educating Around the Code

Generative AI software is excellent at producing code, but how well do we understand this generated output? Inline comments and documentation become increasingly challenging to manage when code is generated at such a high pace. Understanding the code is crucial for both its implementation and maintenance.

The Evolution and Future of Generative AI

The conversation then transitioned to the evolutionary stages of large language models (LLMs) like ChatGPT and Bard. Initially, these models could perform a multitude of tasks without task-specific programming, ranging from crafting poems to writing C++ code. But limitations soon became apparent, primarily around data retrieval and fine-tuning.

To address these issues, two significant advancements were discussed:

Retrieval Augmented Generation (RAG): This method enhances the output by pulling in external data based on the context of the query.
Prompt Tuning: A more targeted approach where the prompt is fine-tuned using machine learning models to generate better code.
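To make the retrieval step concrete, here is a minimal sketch of Retrieval Augmented Generation. It is not how Pieces Copilot is implemented: the bag-of-words cosine ranking stands in for a real embedding model, and the names `retrieve` and `build_prompt` are hypothetical helpers invented for this illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Prepend the retrieved context to the question before calling an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Use the context below to answer.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )
```

The resulting prompt string would then be sent to whichever model is in use. The point is that the model answers from retrieved material rather than from its training data alone.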

Generative AI Use Cases Beyond Code Generation

Generative AI's role is not limited to merely churning out code. It also aids in generating metadata, tags, and descriptions—essentially streamlining multiple facets of a developer's workflow. The focus is on creating use cases of generative AI that aren’t a one-size-fits-all but tailored for specific tasks, incorporating traditional machine learning and data retrieval techniques to maximize the output quality.

Individual vs Enterprise Workflows

The conversation then pivots towards the contrast between workflows for individual developers and those in an enterprise setting. We note that the core challenges are the same in both settings. Three main pillars make up the developer’s day: browser interactions, work within the Integrated Development Environment (IDE), and collaboration in spaces like Microsoft Teams, Discord, GitHub, Google Chat, or Slack.

Tsavo elaborates that Pieces aims to optimize the workflow for individual developers, focusing on hour-by-hour productivity. As this efficiency scales up to a team level, it creates a 'compounded effect of productivity,' essentially forming a decentralized, shared knowledge base within a team. This is the goal behind the cloud features of Pieces, allowing various drives to interact and share information seamlessly.

Enterprise Generative AI Features in Progress

Additional enterprise features are also in development, including more support around in-line comments, permission-based sharing, and real-time sync capabilities. These features aim not just to streamline workflow but also to foster connectivity among team members or even with outside experts like blog authors and YouTube publishers. Pieces Copilot uses its AI capabilities to understand who the developer needs to connect with, whether it's for an AI code review, pair programming, or educational content.

The overarching message is that optimizing the experience for the individual developer is crucial. If the workflow isn't right for the individual, it will not scale effectively at the enterprise level. Finally, we discuss how Pieces strives to minimize the 'context-switching chaos' that hampers productivity, aiming to keep developers in flow within their existing applications. This focus is viewed as essential for enhancing developer productivity on both individual and enterprise levels.

Demo Session: Exploring the New Features of Pieces OS 6.0.0

In the Pieces Copilot demo session, Tsavo dives into the capabilities of the latest version of Pieces OS 6.0.0, demonstrating the advanced features that aim to revolutionize the way developers interact with their codebase. The demo revolves around several key areas: generative AI with large language models, context setting, and the Global Search functionalities, among others. The demo also previews some exciting updates coming down the pipeline.

Generative AI and Code Snippets

Tsavo highlights the applications of generative AI that make Pieces Copilot a game-changer for developers. When you paste a piece of code into the desktop app, the on-device AI identifies the programming language, whether it's TypeScript, Dart, or another. It automatically generates a title, description, and other context-related information like suggested tags and links to related documentation. This "contextual richness," as Tsavo calls it, makes it easier to search and reuse these snippets later on.

Context-Driven Copilots

One of the standout features of this release is the ability for users to set multiple types of contexts for their conversation—website URLs, saved snippets, and even entire directories—that the Copilot uses to ground its recommendations. Tsavo shows this by adding URLs related to Flutter image pickers and by selecting saved snippets, instantly preparing the Copilot to understand his current workflow needs. This is especially useful when the developer is working on a specific project and needs specialized guidance.

Security Measures

Tsavo stresses the importance of on-device security, explaining that Pieces conducts an on-device security analysis to ensure that sensitive data like API keys are redacted before being used as context for the Copilot—something that even works in an air-gapped, offline state. More on the technical details behind this are included later in the conversation.

Multimodal Inputs and Responses

In a move toward more flexible interaction, Tsavo demonstrates that Pieces Copilot can even accept screenshots as inputs, extract the code using OCR technology, and provide targeted advice. In his demo, he queries how to set up a Flutter image picker unit test, and the Copilot generates a comprehensive response that includes YAML file configuration and unit test code, which he then saves for future reference.

Global Search

Global Search acts as an "offline Google," allowing users to search through all their saved snippets. It is especially powerful when coupled with the rich context that LLMs and generative AI add to each snippet. Tsavo shows how easy it is to find a previously saved code snippet on Flutter image picker unit tests, demonstrating the holistic workflow that Pieces enables.
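As a rough illustration of why enriched metadata helps search, the sketch below ranks snippets by matching query terms against the title, tags, description, and code. The `Snippet` fields and the field weights are assumptions made up for this example, not the actual Pieces data model or ranking logic.

```python
from dataclasses import dataclass, field


@dataclass
class Snippet:
    code: str
    title: str = ""
    description: str = ""
    tags: list[str] = field(default_factory=list)


# Hypothetical weights: a hit in the title or tags counts more than one in the code.
FIELD_WEIGHTS = {"title": 3, "tags": 2, "description": 1, "code": 1}


def score(snippet: Snippet, query: str) -> int:
    """Sum the weights of every field that contains each query term."""
    terms = query.lower().split()
    fields = {
        "title": snippet.title.lower(),
        "tags": " ".join(snippet.tags).lower(),
        "description": snippet.description.lower(),
        "code": snippet.code.lower(),
    }
    return sum(
        weight
        for name, weight in FIELD_WEIGHTS.items()
        for term in terms
        if term in fields[name]
    )


def search(snippets: list[Snippet], query: str) -> list[Snippet]:
    """Return matching snippets, best match first; non-matches are dropped."""
    hits = [(score(s, query), s) for s in snippets]
    hits = [(sc, s) for sc, s in hits if sc > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in hits]
```

A snippet saved with no metadata can only be found through words that appear in its code, while the same code with a generated title and tags becomes findable by what it is about.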

Upcoming Features

Before wrapping up, Tsavo teases an upcoming feature that will likely change the generative AI startup landscape, allowing users to input a video URL, such as a tutorial from YouTube. The system will then run OCR on most of the video frames to extract code snippets and even pull the transcript, thereby adding an additional layer of context to help generate more accurate and relevant code for your project.

Local vs Cloud LLMs

To kick off the Q&A portion of the live stream, the conversation shifted towards the type of LLMs powering Copilot and Pieces for Developers. Mark pointed out that we currently use GPT-3.5 for our cloud-based operations. However, he revealed that a significant portion of our recent work is focused on integrating a local LLM into the application.

Local LLMs: A Game-Changer

The local LLM that is in the works is based on the Code Llama model released by Meta. According to Mark, this foundational model offers an impressive balance between performance and model size.

What sets this apart is its ability to run effectively whether you have a CPU or a dedicated GPU. If your machine has a powerful GPU, you can expect exceptional performance. This local implementation is expected to make it into Pieces Copilot and the various plugins soon.

The local LLM allows users to turn their Wi-Fi off and have an entirely air-gapped, offline language model, without losing any of the feature set. Pieces is adopting a generic approach to integrating these LLMs, making it easier to keep up with rapidly evolving technologies.

Users can customize their experience based on the processing power of their machine; you can choose a model that is fast and less accurate or one that is slow but more accurate, tailoring it to your development needs.

Memory and Resource Management

Brian added an important point on resource allocation and memory management, which are common concerns among developers and obstacles for building a startup with generative AI. He mentioned that we are planning to release the first local models next week. These models will be highly modular, enabling the toggling on and off of various components depending on what is needed. This helps in optimizing RAM usage.

For instance, when engaging with Copilot, the model boots up quickly and releases its resources back to the OS after a period of inactivity to ensure optimal performance. We're also working on memory management features that would make the tool more efficient, using fine-tuned models like CodeT5 from Salesforce to optimize resource consumption.

The Future is Bright

One of the most exciting aspects revealed is the adaptability of our machine learning pipeline. It is designed to accommodate both small language models that power specific features like search, tags, and titles, as well as larger, more comprehensive language models.

With new models being released almost every day, the infrastructure at Pieces for Developers is agile enough to quickly integrate these advancements, keeping the copilot up to date with the latest technologies in LLMs and the generative AI landscape.

Copilot Integrations Across your Workflow

One of the most discussed topics during our recent AMA was the game-changing potential of Pieces Copilot and its wide-ranging integrations. The audience was particularly curious about how it fits into their daily workflows, spanning different apps, environments, and even teams.

A Context-Aware Assistant for Every Application

Next, we discussed Pieces Copilot's AI-powered cross-platform capabilities. Not just limited to a single IDE or note-taking app, Copilot has been designed to operate at the operating system level.

This allows for a rich, context-aware experience regardless of the application in use. Imagine working in VS Code, taking a break to jot down some notes in Obsidian, and having a consistent Copilot experience throughout. It's not just about having an intelligent code assistant; it's about having comprehensive generative AI coding tools that are in tune with your broader project objectives.

Your Team, Amplified

Pieces Copilot isn't just an individual tool; it's designed to be a "team player." Not only does it provide code-related suggestions, but it can also link you to team members or external experts relevant to your current task. It's like having a virtual mentor by your side, ensuring that resources—be it human or code—are utilized effectively.

Taking User Privacy Seriously

The Q&A session touched upon the important issue of user privacy. Rest assured, Pieces Copilot keeps all data locally stored, and users will have the freedom to customize how they want the Copilot to interact with them: on-device, or connected to the cloud.

The Future of Pieces: Upcoming Features and Plans

During the AMA, several questions were raised about the future of the top generative AI companies like Pieces, what lies ahead for the Pieces Copilot, and how it plans to scale and monetize its features. Here’s a rundown of what you can expect:

Expanding to the Design World

One of the burning questions from the community was about making Pieces available for designers. Tsavo shared that the team is eager to expand beyond developer materials. After securing our Series A funding, the plan is to incorporate design-related assets such as artboards, layers, and color palettes into Pieces. This move aims to bridge the gap between developers and designers, echoing a broader industry trend toward a more unified digital supply chain.

A Transparent Discussion on Monetization Strategy

Being a venture-backed generative AI startup, the question of monetization is inevitable. Tsavo explained that the company's focus is on solving several sub-problems like snippet discovery, on-device models, and browser integration performance before introducing a paid model. We aim to create an experience that is so cohesive and valuable that users will willingly pay for it.

The plan is to roll out Pieces Pro and Pieces Pro Plus Cloud, both priced affordably to ensure users get value for their money. The basic Pro version is expected to cost around $7.99-$8.99 per month, and Pro Plus Cloud will be in the ballpark of $11.99. We are also considering usage-based pricing for teams and enterprises, perhaps introducing features like multiplayer copilots.

Metrics That Matter

Touching on the business side of things, Tsavo mentioned that we are approaching a "golden ratio" in terms of key metrics like Daily Active Users (DAU), Weekly Active Users (WAU), and Monthly Active Users (MAU), along with retention and K-coefficients. With 38,000 total users today, the team believes it is on the brink of product stability and deep user resonance.

Workspaces and External Repositories

One of the most highly anticipated updates revolves around feature enhancements designed to streamline the user experience even further. The introduction of "workspaces" would allow users to better organize their development materials within the platform, grouping items by project, language, or other custom folders, similar to Google Drive.

But that's not all—we’re also working on integrating with external repositories. This new capability would significantly improve contextual awareness and suggestion capabilities, improving the generative AI code generation experience for developers.

Workflow Activity, Data Storage, and Security

Where is your Data Stored?

One of the most frequently asked questions during the AMA was about data storage, specifically for workflow activity. Mark clarified that all workflow activity is stored locally on the user's machine. This means that all actions taken within any plugin of the Pieces for Developers suite are saved right on your computer, offering a sense of security and independence from external servers. Even if you're offline, you can switch to the workflow activity view and continue to see your activity build up, making it a reliable feature.

Can You Export and Backup Your Data?

You can manually zip up your Pieces database to move it to a new device. Although the export is manual at the moment, Mark indicated that they're taking steps to automate this process, allowing for easier backups and even potentially adding the ability to sync between different devices in the coming months.
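If you want to script the manual step, a minimal sketch using only the Python standard library looks like the following. The location of the Pieces database differs by platform and is not specified here, so `data_dir` is a path you supply yourself, and `backup_directory` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def backup_directory(data_dir: Path, dest_dir: Path, name: str = "pieces-backup") -> Path:
    """Zip everything under data_dir into dest_dir/<name>.zip and return its path.

    data_dir is whatever folder holds the data you want to move; keep dest_dir
    outside of data_dir so the archive does not end up inside itself.
    """
    if not data_dir.is_dir():
        raise FileNotFoundError(f"No such directory: {data_dir}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = shutil.make_archive(str(dest_dir / name), "zip", root_dir=str(data_dir))
    return Path(archive)
```

It is safest to quit the application before archiving so that no files change mid-copy. On the new device, unzip the archive into the matching location.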

Open Source SDKs and Extensibility

Pieces is also considering open-sourcing its SDKs in the future. This would allow developers to build custom integrations into Pieces OS, which already runs multiple network servers locally on your machine, such as HTTP, gRPC, and WebSocket endpoints.

With over a hundred endpoints and around 150 data models, Pieces is built to scale and integrate with your unique workflow, whether you want to hook it up with "If This Then That" or build an entirely new app on top of Pieces OS. We believe that this idea is ahead of the curve in generative AI trends, offering users more flexibility in how they want to generate code.

Security Concerns Related to Sharing Sensitive Information

Sensitive data handling is a critical aspect of code management and sharing, thus making it a primary objective of generative AI startups in 2023. Pieces uses a multi-tiered algorithm, including basic regex and machine learning models, to detect sensitive data before you save or share a snippet. Users will be notified if they attempt to share code containing sensitive information, allowing them to make an informed decision.
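The regex tier of such a check can be sketched in a few lines. The patterns below are illustrative assumptions, not the rules Pieces actually ships, and the machine learning tier is left out entirely. The function names `find_sensitive` and `redact` are hypothetical.

```python
import re

# Illustrative patterns only; a production scanner would carry many more rules.
PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header of a private key block.
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    # A quoted value of 8+ characters assigned to a secret-looking name.
    "assigned_secret": re.compile(
        r"\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def find_sensitive(code: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for every suspicious span."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(code):
            findings.append((label, match.group(0)))
    return findings


def redact(code: str) -> str:
    """Replace every suspicious span with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        code = pattern.sub(f"<REDACTED:{label}>", code)
    return code
```

A caller could run `find_sensitive` before a share action to warn the user, and `redact` before passing a snippet to a model as context.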

Education, Best Practices, and Hardware Requirements

Pieces Copilot is designed to help users find, save, and organize code snippets more effectively. We recently introduced dynamic context to our copilot, which personalizes its answers using context from throughout your workflow.

Mark and Tsavo also shed light on how Pieces Copilot goes beyond task-oriented operations and helps in fostering collaboration. It creates a cycle where you can generate code, save it, and then dive into the additional resources, along with context/metadata auto-enriched as you save, to enrich your coding knowledge. This is aimed at boosting productivity and ensuring code quality by offering quick access to documentation, policy guidelines, and AI-enriched insights.

Technical Requirements and Optimization

Startups in generative AI focused on local AI models are bound to come across the challenges of storage, so one huge topic we discussed was the technical requirements for running our software efficiently. We described how our desktop application has been designed to run smoothly on both CPU and GPU, providing flexibility for a variety of users.

GPU Acceleration

To run LLaMA 2 7B on GPU effectively, it is recommended to have a GPU with a minimum of 6GB VRAM. A suitable GPU example for this model is the RTX 3060, which offers an 8GB VRAM version. Other GPUs such as the GTX 1660, 2060, AMD 5700 XT, or RTX 3050, which have at least 6GB VRAM, can also serve as good options to support LLaMA 2 7B. Most modern dedicated GPUs should function well if the machine has appropriate VRAM and RAM.

CPU Support

We will soon release our first version of local qGPT (the engine that powers Pieces Copilot) with 4 models: CPU-optimized and GPU-optimized LLaMA 2 7B, and CPU-optimized and GPU-optimized Code LLaMA 7B. We recommend at least 6GB of RAM on your machine. Some good CPUs to run this on include the Intel Core i9-10900K, i7-12700K, or Ryzen 9 5900X. However, for better performance, you may want to use a more powerful CPU, such as an AMD Ryzen Threadripper 3990X with 64 cores and 128 threads. The more CPU cores you have, the faster it’ll run!

Overall, unless you have a dedicated GPU, it is recommended that you use the CPU model.
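The guidance above boils down to a simple decision rule, sketched here. The 6GB thresholds come from the recommendations in this section; the `Machine` type, the return labels, and the fall-back to the cloud model when neither threshold is met are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MIN_VRAM_GB = 6.0  # recommended minimum VRAM for the GPU model
MIN_RAM_GB = 6.0   # recommended minimum RAM for the CPU model


@dataclass
class Machine:
    ram_gb: float
    vram_gb: Optional[float] = None  # None means no dedicated GPU


def choose_runtime(machine: Machine) -> str:
    """Pick a runtime label based on the hardware recommendations."""
    if machine.vram_gb is not None and machine.vram_gb >= MIN_VRAM_GB:
        return "local-gpu"
    if machine.ram_gb >= MIN_RAM_GB:
        return "local-cpu"
    # Assumed fall-back: below both thresholds, use the cloud model instead.
    return "cloud"
```

For example, a laptop with 16GB of RAM and no dedicated GPU would land on the CPU model, while a desktop with an 8GB RTX 3060 would use the GPU model.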

Memory Management and Future Enhancements

There was a comprehensive discussion about our plans to reduce RAM usage. We're working on smartly loading and unloading machine learning models from memory based on user behavior, thereby optimizing system performance.
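One common way to load and unload models on demand is a cache that loads a model on first use and drops it after a period of inactivity. The sketch below shows that general pattern under stated assumptions; it is not the Pieces OS implementation, and `ModelCache`, its `loader` callback, and the five-minute default are all hypothetical.

```python
import time
from typing import Any, Callable


class ModelCache:
    """Load models lazily and unload the ones that have sat idle too long."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        idle_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # function that loads a model by name
        self._idle = idle_seconds  # how long a model may stay unused
        self._clock = clock        # injectable so the cache can be tested
        self._models: dict[str, tuple[Any, float]] = {}  # name -> (model, last used)

    def get(self, name: str) -> Any:
        """Return the model, loading it on first use, and mark it as just used."""
        if name in self._models:
            model, _ = self._models[name]
        else:
            model = self._loader(name)
        self._models[name] = (model, self._clock())
        return model

    def evict_idle(self) -> list[str]:
        """Drop every model unused for idle_seconds or more; return their names."""
        now = self._clock()
        stale = [n for n, (_, used) in self._models.items() if now - used >= self._idle]
        for name in stale:
            del self._models[name]
        return stale

    def loaded(self) -> list[str]:
        """Names of the models currently held in memory."""
        return sorted(self._models)
```

Calling `evict_idle` from a periodic timer frees memory for models the user has stopped interacting with, while the next call to `get` transparently reloads them.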

Additionally, we noted that we reduced the size of updates for Pieces OS. This provides a much better experience, as users no longer have to fully redownload the application and instead only add updates to the existing bundle on their computer.

Performance Benchmarks

Practical benchmarks were discussed, highlighting that a quad-core system with 2.7 GHz or higher and 16-32 gigs of RAM should suffice for smooth operation, even when running other demanding applications.

Automatic Context Setting & Re-Grounding

In the final portion of the Q&A, we touched on a highly relevant issue for developers: the automation of context setting. Known as "auto context" and "automatic re-grounding," these features aim to make context management almost invisible to the end user, essentially running in the background. This development is especially pivotal in multi-platform development environments like VS Code, desktop apps, and note-taking tools like Obsidian.

The idea is that, as you switch between different tasks and applications, Pieces Copilot will continuously "re-ground" itself, adapting the context automatically to suit the task at hand. So whether you're writing Python code, editing Markdown files, or pulling information from your repository, the artificial intelligence will adapt to serve you the most relevant snippets, tips, or code completions.

We also hinted at the ability to have "context presets" which can be manually set for specific code snippets, repositories, or even website URLs and videos. This feature is expected to be released in the back half of September.

Manual Context Setting in VS Code

For those who want manual control over the context, we unveiled a new VS Code feature that allows users to specify a context by simply typing phrases like "ask Pieces Copilot about X." Users can now ask Copilot to generate more endpoints, provide more comments, or even explain code, offering a highly customizable experience.

Snippet Discovery and Workflow Activity

We encourage users to import their snippets from local repositories to make their Copilot more intelligent and adaptive. The Workflow Activity feature provides a real-time log of your actions, making it easier to resume work after interruptions.

This log can include edited snippets, reclassified languages, and attached links among others, and it helps in training and "re-grounding" your Copilot. We believe this will be a huge focus for top generative AI startups focused on accurate code generation.

Generative AI for Enterprise Tasks Using Workflow Activity

One major future enhancement discussed was the ability to query your workflow activity directly from the Copilot, using questions like "What was I searching recently?" or "Who shared something with me recently?" These queries will sift through your workflow log and provide the most relevant events to assist you. This becomes incredibly useful for enterprises where tracking workflow activities could have broader implications for collaborative work and project management.
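Once a natural-language question has been mapped to an event type, answering it is essentially a filter over the activity log. Here is a minimal sketch of that filtering step, assuming a hypothetical `Event` record with a `kind` such as "search" or "share". The real workflow activity schema is not described in this post, and the language-understanding step is omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical event type, e.g. "search", "share", "save"
    detail: str    # free-text description of what happened
    at: datetime   # when the event occurred


def recent_events(
    log: list[Event],
    kind: str,
    now: datetime,
    window: timedelta = timedelta(days=1),
    limit: int = 5,
) -> list[Event]:
    """Return up to `limit` events of one kind from the last `window`, newest first."""
    matches = [
        e for e in log
        if e.kind == kind and timedelta(0) <= now - e.at <= window
    ]
    matches.sort(key=lambda e: e.at, reverse=True)
    return matches[:limit]
```

"What was I searching recently?" would then map to `recent_events(log, "search", now)`, and the returned events could be handed to the Copilot as context for its answer.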

Pieces for Authors and Beyond

We wrapped up the Q&A by noting that the top generative AI tools aren’t going to be limited to code snippets. Whether you're a blog author or someone dealing with design materials, Pieces aims to operate at the "file fragment level," which we refer to as "pieces." Our repository, aptly named "Pieces for X," is designed to be forked and modified to suit different industries, exemplifying the scalability and broad vision of the product.

Conclusion: Thank You for Joining Us!

We'd like to extend our heartfelt thanks to everyone who joined us for the Pieces Copilot AMA session. Your questions and participation fuel our commitment to creating a generative AI future that serves you in the best possible way. A big shout-out also goes to our readers who stuck around to explore this comprehensive recap. We're thrilled to share our journey and future plans as one of the best generative AI startups with such an engaged and curious community!

Our team at Pieces for Developers continues to push the envelope on what's possible in the realm of intelligent code snippet management and AI-assisted coding. We're incredibly excited about the upcoming features and improvements that we've discussed today, and we hope you are too.

Keep an eye out for announcements about our next AMA; we look forward to diving deeper into what makes Pieces an indispensable tool for developers and how you can be a part of this exciting journey. Whether you're a current user with feedback, a developer intrigued by the possibilities, or someone just curious about the tech, we invite you to attend next time, and to join our community in Discord!

Once again, thank you for your invaluable participation and for being a part of our community. Until the next AMA, happy coding!

The Future of Generative AI Startups: Pieces Copilot AMA Recap and Highlights