Rabbit R1 "AI" Exposed: Crafting a Rabbit R1 Style Food Ordering

by Rok Rak, Software Engineer


Introduction

Welcome to the twilight zone of tech innovation, where groundbreaking advancements are promised at every turn, and yet, we often end up circling back to the basics. Today, we’re peeling back the layers of the Rabbit R1, a device that has hopped into the spotlight, not just for its capabilities, but for its spectacular parade of promises. Marketed as the next big leap in AI, Rabbit R1 was supposed to be your go-to digital butler, fetching everything from Uber rides to your morning bagel with just a simple command. But as we’ve learned, sometimes what sounds like AI is actually just A-Lie or an "Actual Indian" behind the curtain pulling the levers.

Inspired by the great detective work of Coffeezilla, who put the Rabbit R1 under the microscope only to find it's more rabbit hole than revolutionary, this blog will guide you through creating your very own "advanced AI" (or, should we say, cleverly scripted) food ordering application. Let’s dive into the world of "advanced" technology where the only real machine learning is figuring out that sometimes, the machine isn't learning anything at all.

Prerequisites

Ensure you have a Wolt account with your address, phone number, and credit card details updated for seamless food ordering.

Setting Up the Backend with Playwright

To simulate advanced AI functionalities, we'll employ Playwright, a robust automation library that allows us to script interactions with web applications. Here's how to get started:

  1. Initialize your Node.js Project

    First, make sure Node.js is installed on your machine. Then, in your project directory, initialize a new Node.js project:

    npm init -y
    
  2. Install Dependencies

    npm install @fastify/cors axios dotenv fastify playwright
    

    These packages will allow you to:

    • Use @fastify/cors to handle Cross-Origin Resource Sharing (CORS) settings.
    • Manage HTTP requests efficiently with axios.
    • Secure and manage environment variables with dotenv.
    • Create an HTTP server with fastify.
    • Automate browser tasks with playwright.

  3. Install Development Dependencies

    For development, we need types for better code reliability and tools for continuous development without restarting the server manually. Install your development dependencies:

    npm install --save-dev @types/node nodemon ts-node typescript
    
  4. Configure Start Script

    Add the following start script to the "scripts" section of your package.json. It uses nodemon with ts-node to recompile and restart your TypeScript application whenever a source file changes:

    "start": "nodemon --watch \"src/**\" --ext \"ts,json\" --ignore \"src/**/*.spec.ts\" --exec \"ts-node src/app.ts\"",
    

Awesome! Now we are ready to start coding our food ordering service.

Start Building the API Server

Let's first implement our API routes.

import fastify from 'fastify'
import { createOrder, orderFood } from './service/createOrder'
import cors from '@fastify/cors'
import { ConfirmOrder, Order } from './types'
import 'dotenv/config'

const server = fastify()

server.register(cors, {
  origin: '*',
})

server.post('/order', async (request, reply) => {
  const order = request.body as Order
  const confirmOrder = await createOrder(order)
  reply.send(confirmOrder)
})

server.post('/confirmOrder', async (request, reply) => {
  const order = request.body as ConfirmOrder
  const orderStatus = await orderFood(order)
  reply.send(orderStatus)
})

server.listen({ port: 8080 }, (err, address) => {
  if (err) {
    console.error(err)
    process.exit(1)
  }
  console.log(`Server listening at ${address}`)
})

Order API Endpoint Functionality

The /order endpoint starts the food ordering process. When called, it looks up the requested restaurant and meal and returns what the user needs to confirm the order.

Returning Meal Details: The API responds with essential information about the selected meal. This includes:

  • id: This is a unique identifier (number) for the meal, which is crucial for efficiently adding the item to the basket in subsequent steps, thus bypassing repetitive searches.
  • mealImage: An image of a meal returned as base64. This visual confirmation is provided so the user can verify that the selected meal matches their expectations, enhancing user confidence before finalizing the order.
  • pageUrl: The URL of the restaurant's page. Including this streamlines the process by directing the user or automated system directly to the restaurant’s page, eliminating the need for additional search steps in the order confirmation phase.
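The route handler shown earlier casts request.body to Order without checking it. As a minimal sketch, assuming the request body is the { restaurant, meal } object from the Order interface, a runtime check could look like the following. The isOrder helper is hypothetical and not part of the original source.

```typescript
// Shape of the /order request body, as defined in the article's types.
interface Order {
  restaurant: string
  meal: string
}

// Hypothetical helper: narrows an unknown request body to Order.
// Both fields must be non-empty strings.
const isOrder = (body: unknown): body is Order => {
  if (typeof body !== 'object' || body === null) return false
  const { restaurant, meal } = body as Record<string, unknown>
  return (
    typeof restaurant === 'string' &&
    restaurant.trim() !== '' &&
    typeof meal === 'string' &&
    meal.trim() !== ''
  )
}
```

In the handler, a failed check could be answered with a 400 response before any browser is launched.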

ConfirmOrder API Endpoint Functionality

The /confirmOrder endpoint serves as the final step in the food ordering process. Upon receiving the final confirmation from the user, this endpoint completes the order with the restaurant and provides the user with essential feedback regarding the order status. Here’s what the response contains:

  • status: The status field in the response indicates whether the order was successful or if there was an error during the process. This immediate feedback is crucial for ensuring that users are not left in uncertainty about the outcome of their action.
  • message: Accompanying the status, the message provides a descriptive explanation of what occurred. If the order was successful, it might confirm that the order has been placed. If there was an error, the message will detail the issue so the user can attempt to resolve it, whether that means retrying the order, checking their payment information, or contacting customer support.
  • waitingTime: This field provides an estimated range of time (e.g., "15-20 minutes") indicating how long it will take for the meal to be prepared and ready for pickup or delivery. Providing an estimated waiting time helps manage customer expectations and allows them to plan accordingly.
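The three fields above can be written down as a type. This is a sketch under assumptions: the "confirmed" value matches what the frontend in this article compares against, while the "error" value and the toErrorStatus helper are illustrative guesses, and a const object stands in for whatever enum the original project uses.

```typescript
// Assumed status values; only "confirmed" appears in the article's code.
const EOrderStatus = {
  CONFIRMED: 'confirmed',
  ERROR: 'error',
} as const
type EOrderStatus = (typeof EOrderStatus)[keyof typeof EOrderStatus]

// Response body of /confirmOrder as described in the list above.
interface IOrderStatus {
  status: EOrderStatus
  message: string
  waitingTime: string // e.g. "15-20 minutes", empty when unknown
}

// Hypothetical helper: turns a thrown error into a response body.
const toErrorStatus = (err: unknown): IOrderStatus => ({
  status: EOrderStatus.ERROR,
  message: err instanceof Error ? err.message : 'Unknown error',
  waitingTime: '',
})
```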

Create order

The createOrder function drives a real browser session on Wolt: it logs in if necessary, finds the restaurant and the meal, and returns the details the user needs to confirm. Here’s how it operates:

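import { firefox } from 'playwright'

export interface Order {
  restaurant: string
  meal: string
}

export interface ConfirmOrder {
  id: number | null
  mealImage?: string
  pageUrl: string
  error?: string // Set when the restaurant or meal could not be found
}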

export const createOrder = async (order: Order): Promise<ConfirmOrder> => {
  // Initialize a persistent Firefox browser context
  const browser = await firefox.launchPersistentContext('./tmp', {
    headless: false,
    args: ['--disable-dev-shm-usage'],
    viewport: {
      width: 1250,
      height: 1440,
    },
  })

  try {
    // Create a new page in the browser
    const page = await browser.newPage()

    // Disable navigation timeout
    page.setDefaultNavigationTimeout(0)

    // Ensure user is logged into Wolt
    await loginToWoltAccountIfNecessary(page)

    // Extract order details
    const { meal, restaurant } = order

    // Navigate to the restaurants page and find the specified restaurant
    const found = await goToRestaurants(page, restaurant)
    if (!found) throw new Error('Restaurant not found')

    // Find and select the specified meal
    const confirmOrder = await getMeal(page, meal)
    if (!confirmOrder) throw new Error('Meal not found')

    // Return the confirmation details of the selected meal
    return confirmOrder
  } catch (err) {
    // Handle any errors that occur during the order creation process
    return {
      id: null,
      mealImage: '',
      pageUrl: '',
      error: (err as Error).message,
    }
  } finally {
    // Ensure the browser is closed, even if an error occurred
    await browser.close()
  }
}

Key Functions:

  • loginToWoltAccountIfNecessary: Ensures the user is logged into their Wolt account, handling authentication to enable order placement.
  • goToRestaurants: Navigates to the specific restaurant's page based on the user’s choice.
  • getMeal: Locates and selects the desired meal from the menu, returning details for order confirmation.

This function orchestrates the sequence of steps required to place an order through a restaurant’s web interface. It manages navigation, selection, and data retrieval, simplifying the user's role to merely confirming the details returned. By automating this process, the service enhances efficiency and reduces the potential for human error, ensuring a seamless ordering experience.

Restaurant Selection: A Dash of AI Magic

When it comes to choosing a restaurant, our app doesn't just compare strings; it consults our AI buddy, ChatGPT. Think of it as having a good friend who knows just where to eat, even if you don't know how to spell it :D

Scouting the Scene

Our first step is to canvass Wolt’s array of offerings. We collect details like the restaurant name, its URL, and whether it's open:

const fetchRestaurants = async (page: Page): Promise<Restaurant[]> => {
  // Find all restaurant elements on the page
  const restaurantList = await page.$$('div.cb-elevated')

  // Map each restaurant element to a Restaurant object
  return await Promise.all(
    restaurantList.map(async (el, i) => {
      return {
        id: i, // Assign an ID based on the index
        ...(await el.evaluate((restaurant) => {
          return {
            name: (restaurant as any).querySelector('h3')?.textContent, // Extract restaurant name
            url: (restaurant as any).querySelector('a')?.href, // Extract restaurant URL
            // Check if restaurant is open based on the presence of "min" text
            isOpened: Array.from(
              (restaurant as any).querySelectorAll('div'),
            ).some((el) => (el as any).innerText === 'min'),
          }
        })),
        element: el, // Store the ElementHandle for future use
      }
    }),
  )
}
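The scraped list has to become a string before it can be handed to ChatGPT, and the article does not show that step. The sketch below is one possible way to do it, assuming only open restaurants with a name should be offered as candidates. The serializeRestaurants helper and the RestaurantSummary type are hypothetical.

```typescript
// Minimal subset of the Restaurant object built by fetchRestaurants.
interface RestaurantSummary {
  id: number
  name?: string | null
  isOpened: boolean
}

// Hypothetical helper: keeps open restaurants that have a name and
// serializes only id and name, so the prompt stays small.
const serializeRestaurants = (restaurants: RestaurantSummary[]): string =>
  JSON.stringify(
    restaurants.flatMap((r) =>
      r.isOpened && r.name ? [{ id: r.id, name: r.name.trim() }] : [],
    ),
  )
```

Dropping the ElementHandle and URL before serializing keeps the prompt to the two fields the model needs to pick an ID.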

Implementing revolutionary AI

Next, we turn to ChatGPT to sift through the options and pinpoint the right match. This ensures you get to the right place without fuss:

export const fetchDesiredRestaurantOrMeal = async (
  items: string,
  searchedItem: string,
) => {
  try {
    const response = await openAIRequest({
      model: 'gpt-3.5-turbo',
      messages: [
        {
          role: 'system',
          content: `Given the list of items below, determine the ID of the item closely matching the name provided: '${searchedItem}'. Return the ID in JSON format, for example, {"id": 10}. If a matching item is not found return {"id": null}. Ensure to consider slight inaccuracies in the item name during the search.`,
        },
        {
          role: 'user',
          content: items,
        },
      ],
    })

    const data: any = await response.json()
    return data.choices[0].message.content
  } catch (error) {
    console.error(error)
  }
}
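The function above returns the model's reply as a raw string, or undefined when the request fails, so the caller still has to extract the ID. A minimal sketch of that parsing step follows; parseMatchedId is a hypothetical helper, and the tolerance for extra text around the JSON is an assumption about how the model might reply.

```typescript
// Hypothetical helper: extracts the id from a reply such as '{"id": 10}'.
// Returns null when the reply is missing, malformed, or reports no match.
const parseMatchedId = (content: string | undefined): number | null => {
  if (!content) return null
  // The reply may contain text around the JSON, so take the first {...} span.
  const match = content.match(/\{[^{}]*\}/)
  if (!match) return null
  try {
    const parsed: unknown = JSON.parse(match[0])
    const id = (parsed as { id?: unknown }).id
    return typeof id === 'number' && Number.isInteger(id) ? id : null
  } catch {
    return null
  }
}
```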

With a pinch of ChatGPT's intelligence, we can proudly place "AI" before our food ordering service app name.

Confirming Order

Finalizing an order involves a few more steps in our application, ensuring that the meal you've chosen gets into your cart without any hitches. Here's how we handle it:

Adding the Meal to the Cart

When you've settled on a meal, our app diligently adds it to your cart. Here’s the procedure it follows:

const addMealToCart = async (page: Page, pageUrl: string, mealId: number) => {
  // Navigate to the restaurant page
  await page.goto(pageUrl)

  // Wait for the page to load
  await page.waitForLoadState('networkidle')

  // Wait for a key element that indicates the page is fully rendered
  await page.waitForSelector("div[data-test-id='horizontal-item-card']", {
    state: 'visible',
    timeout: 10000,
  })

  // Fetch all meals and find the selected one
  const allMeals: Meal[] = await fetchMeals(page)
  const selectedMeal = allMeals.find((m) => m.id === mealId)
  if (!selectedMeal) throw new Error('Meal not found')

  // Find and click the meal image to open the modal
  const popupImg = await selectedMeal.element.$('img')
  if (!popupImg) throw new Error('Meal image not found')
  await popupImg.click()

  // Wait for the modal to open and the "Add to cart" button to be visible
  await page.waitForSelector("button[data-test-id='product-modal.submit']", {
    state: 'visible',
    timeout: 5000,
  })

  // Click the "Add to cart" button
  await page.click("button[data-test-id='product-modal.submit']")

  // Wait for the cart to update
  await page.waitForSelector("button[data-test-id='cart-view-button']", {
    state: 'visible',
    timeout: 5000,
  })
}

Checking Out

Once the meal is securely in your cart, the next step is to proceed to checkout:

const goToCheckout = async (page: Page, pageUrl: string) => {
  // Navigate to the checkout page
  await page.goto(pageUrl + '/checkout')

  // Wait for the URL to be fully loaded, with network activity idle
  await page.waitForURL(pageUrl + '/checkout', {
    waitUntil: 'networkidle',
    timeout: DEFAULT_TIMEOUT,
  })

  // Select the checkout button using the data-test-id attribute
  const checkoutButton = await page.$(
    "button[data-test-id='BackendPricing.SendOrderButton']",
  )

  // Throw an error if the checkout button is not found
  if (!checkoutButton) throw new Error('Checkout button not found')

  // Click the checkout button to place the order
  await checkoutButton.click()

  // Wait for the order status element to appear on the page
  await page.waitForSelector("div[data-test-id='OrderStatus']", {
    timeout: DEFAULT_TIMEOUT,
  })

  // Select the order status element
  const orderStatus = await page.$("div[data-test-id='OrderStatus']")

  // Get the text content of the first child of the order status element
  const firstChild = await orderStatus?.evaluate(
    (el) => el.firstChild?.textContent,
  )

  // Return the order status information
  return {
    status: EOrderStatus.CONFIRMED,
    message: 'Order placed',
    waitingTime: firstChild || '',
  }
}
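The /confirmOrder route calls an orderFood function that the article does not list. The sketch below shows how the two steps above might be chained, under stated assumptions: the steps are passed in as plain functions (with the Playwright page already bound) so the flow can be exercised without a browser, and the runCheckout and CheckoutSteps names and the "error" status value are hypothetical.

```typescript
interface ConfirmOrder {
  id: number | null
  pageUrl: string
}

interface OrderStatus {
  status: 'confirmed' | 'error'
  message: string
  waitingTime: string
}

// The two browser steps from the article, injected as functions.
interface CheckoutSteps {
  addMealToCart: (pageUrl: string, mealId: number) => Promise<void>
  goToCheckout: (pageUrl: string) => Promise<OrderStatus>
}

// Adds the meal to the cart, then checks out. Any failure is turned
// into an error status instead of an unhandled rejection.
const runCheckout = async (
  order: ConfirmOrder,
  steps: CheckoutSteps,
): Promise<OrderStatus> => {
  try {
    if (order.id === null || !order.pageUrl) {
      throw new Error('Nothing to order')
    }
    await steps.addMealToCart(order.pageUrl, order.id)
    return await steps.goToCheckout(order.pageUrl)
  } catch (err) {
    return {
      status: 'error',
      message: err instanceof Error ? err.message : 'Unknown error',
      waitingTime: '',
    }
  }
}
```

A real orderFood would presumably also open and close the browser context around these steps, as createOrder does.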

And that’s it! We’ve just created our own AI food ordering service in Node.js, capable of navigating restaurant menus and placing orders with ease. While there are still some edge cases to refine and additional features we could explore, the purpose here was just to demonstrate how straightforward it is to implement your own AI-assisted food ordering service.

Setting Up The Frontend

To develop our React application, we're using Vite, a modern and fast build tool that speeds up development with features like hot module replacement. Here’s how to set up your project:

npm create vite@latest app -- --template react-ts
cd app
npm install

Other packages used in this application can be found in the source code linked below the article.

Creating our AIFriend component

The AIFriend component is at the heart of our application, enabling users to interact with the service using their voice. This component is built using React and integrates with the speech recognition and text-to-speech APIs to create a dynamic user experience.

Overview of the AIFriend Component

The AIFriend component uses the react-speech-recognition library to listen to user commands and the TextToSpeech component from our utilities to provide audible feedback. This creates a two-way communication stream where users can speak their choices and receive spoken confirmations.

Key Features of the AIFriend Component

  1. Voice Activation: Users can start their order by simply speaking to the application. Whether it’s selecting a restaurant or choosing a meal, all interactions can be performed hands-free.

  2. Dynamic Interaction: Based on the user’s responses, the AIFriend component guides them through the ordering process. Each response from the user helps the application determine the next step in the conversation.

  3. Error Handling: If the desired input is not recognized or is unavailable, the component smartly prompts the user to repeat or change their response, ensuring a smooth user experience without frustration.

  4. Visual Feedback: Alongside auditory cues, the component features a visually appealing interface with icons that respond to the user’s interaction state—spinning when processing and glowing when awaiting further input.

Implementing the Component

The AIFriend component is implemented with a series of functional steps, each corresponding to a part of the ordering process:

  • Step 1: Selecting a Restaurant: The component listens and records the user’s preferred restaurant.
  • Step 2: Choosing a Meal: Once a restaurant is confirmed, it prompts the user to select a meal.
  • Step 3: Order Confirmation: It asks for confirmation before placing the order, ensuring accuracy.
  • Step 4: Completing the Order: The order details are sent to the backend to be processed, and the user receives confirmation that their order has been placed.
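For the confirmation step, the component listed in this section compares the transcript to the exact string "yes". Speech recognizers may return variants such as "Yes." or "yes please", so a more forgiving check can help. The sketch below is one way to do that; isAffirmative and its word list are hypothetical and not part of the original source.

```typescript
// Hypothetical list of spoken confirmations treated as "yes".
const AFFIRMATIVE = new Set(["yes", "yeah", "yep", "correct", "sure"]);

// Returns true when the first spoken word is a confirmation.
const isAffirmative = (transcript: string): boolean => {
  const [first] = transcript
    .toLowerCase()
    .replace(/[^a-z\s]/g, " ") // drop punctuation the recognizer may add
    .split(/\s+/)
    .filter(Boolean);
  return first !== undefined && AFFIRMATIVE.has(first);
};
```

Checking only the first word keeps replies like "no, yes, wait" from being read as a confirmation.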

import SpeechRecognition, {
  useSpeechRecognition,
} from "react-speech-recognition";
import TextToSpeech from "../components/TextToSpeech";
import { Dispatch, SetStateAction, useEffect, useState } from "react";
import { CircleIcon } from "../assets/icons";
import {
  ConfirmOrder,
  IOrderStatus,
  Order,
  sendConfirmOrder,
  sendOrder,
} from "../api/order";

enum RobotText {
  Restaurant = "Please select a restaurant.",
  Meal = "Please select a meal.",
  Order = "Is this the meal you want to order?",
  ThanksForOrdering = "Thank you for confirming your order. Ordering now...",
  OrderSuccess = "Your order was successful.",
}

enum RobotErrorText {
  Restaurant = "Sorry, we could not find the restaurant. Please repeat desired restaurant.",
  Meal = "Sorry, we could not find the meal. Please repeat desired meal.",
  MealError = "Sorry, we could not find the meal. Please repeat desired meal.",
  OrderError = "Sorry, we could not find the order. Please repeat desired order.",
}

interface Props {
  setOrderImage: Dispatch<SetStateAction<string>>;
  setOrderStatus: Dispatch<SetStateAction<IOrderStatus | null>>;
}

export const AIFriend: React.FC<Props> = ({
  setOrderImage,
  setOrderStatus,
}) => {
  const [step, setStep] = useState<number | null>(null);
  const [robotText, setRobotText] = useState("");
  const [speechEnded, setSpeechEnded] = useState(false);
  const [order, setOrder] = useState<Order>({ restaurant: "", meal: "" });
  const [confirmOrder, setConfirmOrder] = useState<ConfirmOrder>(
    {} as ConfirmOrder
  );
  const [loading, setLoading] = useState(false);

  const { transcript, finalTranscript, resetTranscript } =
    useSpeechRecognition();

  useEffect(() => {
    if (speechEnded) {
      setRobotText("");
      setSpeechEnded(false);

      if (step === 5) return;
      SpeechRecognition.startListening();
    }
  }, [speechEnded]);

  useEffect(() => {
    const processTranscript = async () => {
      switch (step) {
        case 1:
          if (order.restaurant) return;
          handleFirstStep();
          break;
        case 2:
          if (order.meal) return;
          await handleSecondStep();
          break;
        case 3:
          await handleThirdStep();
          break;
        case 4:
          if (!confirmOrder.pageUrl) return;
          await handleFourthStep();
          break;
        case 5:
          await handleFifthStep();
          break;
      }
      resetTranscript();
    };
    if (finalTranscript || step === 5) {
      processTranscript();
    }
  }, [step, order, finalTranscript]);

  const handleFirstStep = () => {
    setOrder({ ...order, restaurant: finalTranscript });
    handleStep(2, RobotText.Meal);
  };

  const handleSecondStep = async () => {
    setLoading(true);
    setOrder({ ...order, meal: finalTranscript });

    const { id, mealImage, pageUrl, error } = await sendOrder({
      ...order,
      meal: finalTranscript,
    });

    // If restaurant not found go back to step 1
    if (error === "Restaurant not found") {
      setOrder({ restaurant: "", meal: "" });
      handleStep(1, RobotErrorText.Restaurant);
      setLoading(false);
      return;
    }

    // If meal not found go back to step 2
    if (!id) {
      setOrder({ ...order, meal: "" });
      handleStep(2, RobotErrorText.Meal);
      setLoading(false);
      return;
    }
    setConfirmOrder({ id, pageUrl });
    handleStep(3, RobotText.Order);
    setOrderImage(mealImage!);
    setLoading(false);
  };

  const handleThirdStep = async () => {
    setOrderImage("");
    if (finalTranscript === "yes") {
      setLoading(true);
      handleStep(4, RobotText.ThanksForOrdering);
    } else {
      setOrder({ ...order, meal: "" });
      handleStep(2, RobotErrorText.MealError);
    }
  };

  const handleFourthStep = async () => {
    setLoading(true);
    setOrder({ restaurant: "", meal: "" });
    setConfirmOrder({} as ConfirmOrder);
    const orderStatus = await sendConfirmOrder(confirmOrder);

    if (orderStatus.status === "confirmed") {
      handleStep(5, RobotText.OrderSuccess);
    } else {
      handleStep(5, RobotErrorText.OrderError);
    }
    setOrderStatus(orderStatus);
    setLoading(false);
  };

  const handleFifthStep = () => {
    setOrder({ restaurant: "", meal: "" });
    setConfirmOrder({} as ConfirmOrder);
    setStep(null);

    setTimeout(() => {
      setOrderStatus(null);
    }, 3000);
  };

  const handleStep = (step: number, text: string) => {
    setRobotText(text);
    setStep(step);
  };

  const getAppropriateCircleClass = () => {
    if (robotText || loading) {
      return {
        ani: "motion-safe:animate-spin-slow",
        color: "#67e8f9",
      };
    }
    if (!robotText && step != null) {
      return {
        ani: "motion-safe:animate-scale",
        color: "#4ade80",
      };
    }
    return {
      ani: "",
      color: "#0ea5e9",
    };
  };

  return (
    <div className="flex flex-col items-center relative w-[500px]">
      <div
        onClick={() => handleStep(step ? step + 1 : 1, RobotText.Restaurant)}
        className="w-max h-max relative cursor-pointer hover:scale-105 duration-300"
      >
        <CircleIcon size={120} classes={getAppropriateCircleClass()} />
        {!step && (
          <div className="absolute top-1/2 left-1/2 transform -translate-x-1/2 -translate-y-1/2 text-white text-lg font-semibold">
            START
          </div>
        )}
      </div>

      <div className="absolute bottom-[-20px]">
        {robotText && (
          <TextToSpeech text={robotText} setSpeechEnded={setSpeechEnded} />
        )}
        <div className="text-[#4ade80]">{transcript}</div>
      </div>
    </div>
  );
};
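The component imports sendOrder and sendConfirmOrder from "../api/order", which the article does not list. The sketch below shows what such a client might look like, assuming the backend runs on http://localhost:8080 as configured earlier. The createOrderApi factory and the FetchLike type are hypothetical; the fetch function is passed in so the client can be exercised with a stub.

```typescript
interface Order {
  restaurant: string;
  meal: string;
}

interface ConfirmOrder {
  id: number | null;
  mealImage?: string;
  pageUrl: string;
  error?: string;
}

interface IOrderStatus {
  status: string;
  message: string;
  waitingTime: string;
}

// The part of the fetch API used here, typed structurally.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json: () => Promise<unknown> }>;

// Hypothetical factory: builds the two API calls on top of a fetch function.
const createOrderApi = (
  fetchImpl: FetchLike,
  baseUrl = "http://localhost:8080" // assumed backend address
) => {
  // POSTs a JSON body and returns the parsed JSON response.
  async function postJson<T>(path: string, body: unknown): Promise<T> {
    const response = await fetchImpl(`${baseUrl}${path}`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    });
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return (await response.json()) as T;
  }

  return {
    sendOrder: (order: Order) => postJson<ConfirmOrder>("/order", order),
    sendConfirmOrder: (order: ConfirmOrder) =>
      postJson<IOrderStatus>("/confirmOrder", order),
  };
};
```

In the browser, the client would be created once with the built-in fetch, for example createOrderApi(fetch), and its two functions exported for the component to use.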

Conclusion

We've reached the end of our journey exploring the practical use of scripted automation in food ordering, inspired by the simplicity behind the Rabbit R1's facade. This project has demonstrated that what appears to be complex AI can often be achieved with well-structured scripts and existing technologies.

By dissecting and rebuilding the process, we’ve shown how to leverage tools like Playwright and Node.js to enhance functionality without deep AI algorithms. Our AIFriend component exemplifies how integrating simple technologies can create user-friendly and efficient solutions.

We hope this exploration inspires you to understand and use technology creatively. Dive into the source code, experiment, and see how you can apply these concepts in your own projects. Remember, innovation often means rethinking how to use the tools already at hand.

Source Code can be found here

Company
WebZone d.o.o.
VAT: SI90485661
Slovenia