diff --git a/index.html b/index.html
index 84e62ab..80c838c 100644
--- a/index.html
+++ b/index.html
@@ -110,6 +110,22 @@
 .cost_text {
     margin-bottom: -5px;
 }
+footer p {
+    color: var(--gray-500, #6B7280);
+
+    /* text-xs/font-medium */
+    font-family: Inter;
+    font-size: 12px;
+    font-style: normal;
+    font-weight: 500;
+    line-height: 16px; /* 133.333% */
+
+    margin: 0;
+
+    text-align: center;
+
+    margin-bottom: 24px;
+}
 h2 {
     color: var(--gray-900, #111827);
@@ -195,7 +211,7 @@
 #lenny {
     position: absolute;
     right: 30%;
-    top: 26em;
+    top: 47.5%;
 }
 @media screen and (max-width: 600px) {
     .feature_cards {
@@ -249,29 +265,13 @@
 gtag('config', 'G-S0F5Y25KSC');
-
-
A collection of experiments measuring the performance of GPT-4 Vision.
+Percentages measure how many of our tests passed.
Made with ❤️ by the team at Roboflow.
Last updated November 14, 2023.
Learn about our methodology.
@@ -283,13 +283,12 @@
Over the last 1 day
- , the average response time was 1.0ms.
+Over the last 1 day, the average response time was 1.0ms.
+This number only accounts for requests made by this application.
100.0%
-Uptime
+1.0 ms
Validate GPT-4V's ability to classify objects.
-In this test, we test GPT-4V's ability to classify an object.
- What is in the image? Return the class of the object in the image. Here are the classes: fruit, bowl. You can only return one class from that list. + What is in the image? Return the class of the object in the image. Here are the classes: Toyota Camry, Tesla Model 3. You can only return one class from that list.
- + Toyota Camry
In this test, we test GPT-4V's ability to count objects.
Count the fruit in the image. Return a single number.
- + 10
- + I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the Midnights album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album in order multiple times.
- + The words of songs on the album have been echoing in my head all week. "Fades into the grey of my day old tea."
Every day, we run a set of tests to evaluate how GPT-4 Vision (GPT-4V) performs over time.
These tests are designed to monitor core features of GPT-4V.
-Each test runs the same prompt and image through GPT-4V and compares the answer to a human-written answer.
+Each test runs the same prompt and image through GPT-4V and compares the Result to a human-written Result.
While making this website, we experimented with prompts and chose the prompt that gave the most accurate results.
Tests are run at 1am PT every day. This site is updated when all tests are complete.
If a line is red, it means the test failed that day; if a line is green, the test passed.
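
The methodology text above describes comparing each GPT-4V result to a human-written result and reporting the percentage of tests that passed. A minimal Python sketch of that scoring step follows. It is not the site's actual code: `CheckCase`, `normalize`, `passed`, and `pass_rate` are hypothetical names, and the case-insensitive, whitespace-insensitive comparison is an assumption, since the page does not say how answers are matched.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CheckCase:
    # Hypothetical record; field names are illustrative, not from the project.
    name: str
    model_result: str     # what the model returned for the fixed prompt and image
    expected_result: str  # the human-written result


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed comparison rule)."""
    return " ".join(text.lower().split())


def passed(case: CheckCase) -> bool:
    """A test passes when the normalized results are identical."""
    return normalize(case.model_result) == normalize(case.expected_result)


def pass_rate(cases: Sequence[CheckCase]) -> float:
    """Percentage of tests that passed; 0.0 when there are no tests."""
    if not cases:
        return 0.0
    return 100.0 * sum(passed(c) for c in cases) / len(cases)
```

For example, with the expected results shown in the diff ("Toyota Camry" for classification, "10" for counting) and made-up model outputs of "toyota camry" and "9", `pass_rate` would report 50.0: the first test would be drawn green and the second red.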