

Using the direct AI REST API instead of an SDK

When working with large language models (LLMs) like ChatGPT and Gemini, developers often turn to Software Development Kits (SDKs) for streamlined integration. While SDKs offer convenience, there are compelling reasons to consider using the direct REST API, especially when building new integrations such as editor plugins, or an SDK for a language that lacks one. This blog post explores the advantages of that approach.

ChatGPT and Gemini SDK

SDKs simplify the process of interacting with LLMs. They provide pre-built functions and handle low-level details, making it easier to send prompts and receive responses. Let's look at examples for ChatGPT and Gemini:

ChatGPT SDK

The official OpenAI SDK for Node.js is available at https://github.com/openai/openai-node. Here's how to use it:


import OpenAI from 'openai';

const client = new OpenAI({ apiKey: process.env['OPENAI_API_KEY'] });

async function main() {
  const chatCompletion = await client.chat.completions.create({
    messages: [{ role: 'user', content: 'Say this is a test' }],
    model: 'gpt-3.5-turbo',
  });
  // Print the assistant's reply.
  console.log(chatCompletion.choices[0].message.content);
}
main();
  

Gemini SDK

For Gemini, you can find the SDK at https://github.com/google-gemini/generative-ai-js. Here's a usage example:


const fs = require("fs");
const { GoogleGenerativeAI } = require("@google/generative-ai");

const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

async function main() {
  const prompt = "Does this look store-bought or homemade?";
  const image = {
    inlineData: {
      data: fs.readFileSync("cookie.png").toString("base64"),
      mimeType: "image/png",
    },
  };

  const result = await model.generateContent([prompt, image]);
  console.log(result.response.text());
}
main();
  

SDKs abstract away the complexities of API requests and responses. You don't need to manually construct the request body, headers, or parse the response. This convenience comes at the cost of flexibility and control, which might be limiting in certain scenarios.

ChatGPT and Gemini REST API

Directly using the REST API gives you more control over the interaction with the LLMs. You send HTTP requests to specific endpoints and handle the responses. Here's how to do it for ChatGPT and Gemini:

ChatGPT REST API


curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
     "model": "gpt-4o-mini",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'
  

Gemini REST API


curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[{"text": "Give me python code to sort a list."}]
        }]
       }' 
  

While the REST API offers more control, it requires careful attention to detail. Incorrectly formatted requests will result in errors, and you're responsible for parsing the responses, including error handling.
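As a minimal sketch of the parsing and error handling an SDK would otherwise do for you, the following Node.js snippet extracts the reply text from a raw chat completion response. The `extractReply` helper is illustrative, not part of any library, and the sample payload is a simplified stand-in following the documented response shape:

```javascript
// Minimal sketch: pull the assistant's reply out of a raw chat
// completion response, with explicit error handling.
function extractReply(rawJson) {
  let payload;
  try {
    payload = JSON.parse(rawJson);
  } catch (e) {
    throw new Error(`Response is not valid JSON: ${e.message}`);
  }
  // The API reports failures in an "error" object instead of "choices".
  if (payload.error) {
    throw new Error(`API error: ${payload.error.message}`);
  }
  const content = payload.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("Unexpected response shape: no message content");
  }
  return content;
}

// Simplified example payload following the documented response shape.
const ok = JSON.stringify({
  choices: [{ message: { role: "assistant", content: "This is a test!" } }],
});
console.log(extractReply(ok)); // "This is a test!"
```

With an SDK, malformed JSON or an error payload surfaces as a typed exception; with the raw API, every one of these branches is yours to write.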

Use the Direct REST API for New Implementations

When building a new tool or integration, especially for a niche use case or a new programming language, using the direct REST API can be advantageous. It allows for greater flexibility and control over the interaction with the LLM. You can tailor the requests and responses to your specific needs without being constrained by the limitations of an SDK.
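One concrete benefit: a new integration can translate its own internal prompt format into each provider's request body directly, with no SDK in between. The helper names below (`toOpenAIBody`, `toGeminiBody`) are illustrative, not from any library; the field names match the request shapes shown in the curl examples above:

```javascript
// Sketch: build provider-specific request bodies from the same
// (system instruction, user prompt) pair.
function toOpenAIBody(systemText, userText, model) {
  return {
    model,
    messages: [
      { role: "system", content: systemText },
      { role: "user", content: userText },
    ],
  };
}

function toGeminiBody(systemText, userText) {
  return {
    system_instruction: { parts: [{ text: systemText }] },
    contents: [{ role: "user", parts: [{ text: userText }] }],
  };
}

const body = toOpenAIBody("Be terse.", "Say this is a test!", "gpt-4o-mini");
console.log(JSON.stringify(body));
```

Because you own the mapping, adding a provider-specific field later (a new sampling parameter, say) is a one-line change instead of waiting for an SDK release.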

Example: Neovim Plugin

In the code-ai.nvim plugin, the REST API is used directly to interact with ChatGPT and Gemini. This approach provides the flexibility needed to integrate LLM functionality seamlessly into the Neovim editor.


local curl = require('plenary.curl')
-- ...
-- much more code here
-- ...
function query.ask(instruction, prompt, opts, api_key)
  query.log("entered gemini query.ask")
  local api_host = 'https://generativelanguage.googleapis.com'
  local path = '/v1beta/models/gemini-1.5-pro-latest:generateContent'
  curl.post(api_host .. path,
    {
      headers = {
        ['Content-type'] = 'application/json',
        ['x-goog-api-key'] = api_key
      },
      body = vim.fn.json_encode(
        {
          system_instruction = {parts = {text = instruction}},
          contents = (function()
            local contents = {}
            table.insert(contents, {role = 'user', parts = {{text = prompt}}})
            return contents
          end)()
        }),
      callback = function(res)
        vim.schedule(function() query.askCallback(res, opts) end)
      end
    })
end
-- ...
-- much more code here
-- ...
function query.ask(instruction, prompt, opts, api_key)
  local api_host = 'https://api.openai.com'
  local path = '/v1/chat/completions'
  curl.post(api_host .. path,
    {
      headers = {
        ['Content-type'] = 'application/json',
        ['Authorization'] = 'Bearer ' .. api_key
      },
      body = vim.fn.json_encode(
        {
          model = 'gpt-4-turbo',
          messages = (function()
            local messages = {}
            table.insert(messages, { role = 'system', content = instruction })
            table.insert(messages, {role = 'user', content = prompt})
            return messages
          end)()
        }
      ),
      callback = function(res)
        vim.schedule(function() query.askCallback(res, opts) end)
      end
    })
end
  

Conclusion

While SDKs offer a convenient way to interact with LLMs like ChatGPT and Gemini, using the direct REST API provides greater flexibility and control, which is particularly beneficial when developing new integrations or targeting niche use cases. By understanding the trade-offs, developers can make informed decisions about the best approach for their specific projects.
