
ChatGPT Gemini REST API

Using the direct AI REST API instead of an SDK

When working with large language models (LLMs) like ChatGPT and Gemini, developers often turn to Software Development Kits (SDKs) for streamlined integration. While SDKs offer convenience, there are compelling reasons to use the direct REST API instead, especially when starting a new implementation such as an editor plugin or an SDK for a language that doesn't have one yet. This blog post explores the advantages of this approach.

ChatGPT and Gemini SDKs

SDKs simplify the process of interacting with LLMs. They provide pre-built functions and handle low-level details, making it easier to send prompts and receive responses. Let's look at examples for ChatGPT and Gemini:

ChatGPT SDK

The OpenAI SDK for Node.js, which covers the ChatGPT models, is available at https://github.com/openai/openai-node. Here's how to use it:


import OpenAI from 'openai';

const client = new OpenAI({ apiKey: process.env['OPENAI_API_KEY'] });

async function main() {
  const chatCompletion = await client.chat.completions.create({
    messages: [{ role: 'user', content: 'Say this is a test' }],
    model: 'gpt-3.5-turbo',
  });
  // Print the assistant's reply
  console.log(chatCompletion.choices[0].message.content);
}

main();
  

Gemini SDK

For Gemini, you can find the SDK at https://github.com/google-gemini/generative-ai-js. Here's a usage example:


const fs = require("fs");
const { GoogleGenerativeAI } = require("@google/generative-ai");

const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

async function main() {
  const prompt = "Does this look store-bought or homemade?";
  const image = {
    inlineData: {
      // readFileSync already returns a Buffer; encode it as base64
      data: fs.readFileSync("cookie.png").toString("base64"),
      mimeType: "image/png",
    },
  };

  const result = await model.generateContent([prompt, image]);
  console.log(result.response.text());
}

main();
  

SDKs abstract away the complexities of API requests and responses: you don't need to construct the request body and headers by hand or parse the raw response. This convenience comes at the cost of flexibility and control, which can be limiting in certain scenarios.

ChatGPT and Gemini REST APIs

Directly using the REST API gives you more control over the interaction with the LLMs. You send HTTP requests to specific endpoints and handle the responses. Here's how to do it for ChatGPT and Gemini:

ChatGPT REST API


curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
     "model": "gpt-4o-mini",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'
  

Gemini REST API


curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[{"text": "Give me python code to sort a list."}]
        }]
       }' 
  

While the REST API offers more control, it requires careful attention to detail. Incorrectly formatted requests will result in errors, and you're responsible for parsing the responses, including error handling.
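
To illustrate what that responsibility looks like, here is a minimal sketch in Node.js (18+, which ships a built-in fetch) that calls the ChatGPT endpoint from the curl example above without any SDK. The request construction mirrors the documented API; the error-handling choices are just one reasonable approach, not the only one.


// Minimal sketch: calling the chat completions endpoint with plain fetch
// (Node.js 18+). The error handling is illustrative, not exhaustive.
async function askChatGPT(prompt) {
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: 'gpt-4o-mini',
      messages: [{ role: 'user', content: prompt }],
      temperature: 0.7,
    }),
  });

  // A non-2xx status can mean a malformed body, a bad key, a rate limit...
  if (!response.ok) {
    throw new Error(`OpenAI API error ${response.status}: ${await response.text()}`);
  }

  // Navigating the response structure is now your job
  const data = await response.json();
  return data.choices[0].message.content;
}

askChatGPT('Say this is a test!').then(console.log).catch(console.error);
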

Use the Direct REST API for New Implementations

When building a new tool or integration, especially for a niche use case or a new programming language, using the direct REST API can be advantageous. It allows for greater flexibility and control over the interaction with the LLM. You can tailor the requests and responses to your specific needs without being constrained by the limitations of an SDK.

Example: Neovim Plugin

In the code-ai.nvim plugin, the REST API is used directly to interact with ChatGPT and Gemini. This approach provides the flexibility needed to integrate LLM functionality seamlessly into the Neovim editor.


local curl = require('plenary.curl')
-- ...
-- much more code here
-- ...
-- Gemini variant: POST the prompt to the generateContent endpoint,
-- authenticating with the x-goog-api-key header
function query.ask(instruction, prompt, opts, api_key)
  query.log("entered gemini query.ask")
  local api_host = 'https://generativelanguage.googleapis.com'
  local path = '/v1beta/models/gemini-1.5-pro-latest:generateContent'
  curl.post(api_host .. path,
    {
      headers = {
        ['Content-type'] = 'application/json',
        ['x-goog-api-key'] = api_key
      },
      body = vim.fn.json_encode(
        {
          system_instruction = {parts = {text = instruction}},
          contents = (function()
            local contents = {}
            table.insert(contents, {role = 'user', parts = {{text = prompt}}})
            return contents
          end)()
        }),
      -- the curl callback runs outside the main loop; vim.schedule defers
      -- response handling until it is safe to touch the editor
      callback = function(res)
        vim.schedule(function() query.askCallback(res, opts) end)
      end
    })
end
-- ...
-- much more code here
-- ...
-- ChatGPT variant (defined in a separate module of the plugin): the same
-- pattern, with the OpenAI endpoint, Bearer auth and chat-completions payload
function query.ask(instruction, prompt, opts, api_key)
  local api_host = 'https://api.openai.com'
  local path = '/v1/chat/completions'
  curl.post(api_host .. path,
    {
      headers = {
        ['Content-type'] = 'application/json',
        ['Authorization'] = 'Bearer ' .. api_key
      },
      body = vim.fn.json_encode(
        {
          model = 'gpt-4-turbo',
          messages = (function()
            local messages = {}
            table.insert(messages, { role = 'system', content = instruction })
            table.insert(messages, {role = 'user', content = prompt})
            return messages
          end)()
        }
      ),
      callback = function(res)
        vim.schedule(function() query.askCallback(res, opts) end)
      end
    })
end
  

Conclusion

While SDKs offer a convenient way to interact with LLMs like ChatGPT and Gemini, using the direct REST API provides greater flexibility and control, which is particularly beneficial when developing new integrations or targeting niche use cases. By understanding the trade-offs, developers can make informed decisions about the best approach for their specific projects.
