Viewpoint Documentation

Introduction

Viewpoint is an AI-powered accessibility tool for tasks that traditional screen readers struggle with. It lets you navigate inaccessible software, read on-screen text, and read scanned or otherwise inaccessible PDF files.

Powered by Google's latest Gemini model, Viewpoint can identify UI elements and text on screen and present them in an accessible format. It is designed to run alongside your screen reader and offers three distinct modes:

  • UI Mode: Interact with inaccessible software.
  • OCR Mode: Identify all on-screen text.
  • PDF Reader: Read scanned or inaccessible PDF files.

Setup

Since Viewpoint is powered by Gemini, you will need a Gemini API key, which can be obtained from the link below. Google provides free access to the API with generous usage limits that should be more than sufficient for daily use of Viewpoint.

Get An API Key

After obtaining an API key, download the installer, run it, and follow the installation prompts.

When you run Viewpoint for the first time, it will prompt you for your API key. Paste it into the dialog that appears. Your API key is stored locally and never shared with anyone.

Global Hotkeys

Below is a list of common hotkeys used throughout Viewpoint.

Hotkey                    Action
Ctrl + Shift + \          Cycle Between Modes
Ctrl + Shift + /          Activate Selected Mode
Ctrl + Alt + Shift + V    Open Viewpoint Settings
Ctrl + Shift + F4         Close Viewpoint
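
The cycle-then-activate pattern behind the first two hotkeys can be pictured as a small state machine. The sketch below is illustrative only and is not Viewpoint's actual implementation: the `ModeSelector` class, its method names, the choice of UI Mode as the initial selection, and the assumption that cycling wraps from the last mode back to the first are all hypothetical.

```python
from dataclasses import dataclass

# Mode names as listed in the introduction.
MODES = ("UI Mode", "OCR Mode", "PDF Reader")


@dataclass
class ModeSelector:
    """Hypothetical model of mode selection; not Viewpoint's real code."""

    index: int = 0  # assumption: the first mode is selected at startup

    @property
    def selected(self) -> str:
        return MODES[self.index]

    def cycle(self) -> str:
        # Models Ctrl + Shift + \ (assumed to wrap around after the last mode).
        self.index = (self.index + 1) % len(MODES)
        return self.selected

    def activate(self) -> str:
        # Models Ctrl + Shift + / by reporting which mode would run.
        return f"Activating {self.selected}"


selector = ModeSelector()
selector.cycle()            # UI Mode -> OCR Mode
print(selector.activate())  # prints "Activating OCR Mode"
```

Under these assumptions, pressing the cycle hotkey three times returns to the mode you started on.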

UI Mode

UI Mode allows you to navigate inaccessible software. When activated, Viewpoint takes a screenshot and identifies any on-screen UI elements.

Once the UI has been generated, you can navigate it with Tab and Shift + Tab. Pressing Enter or Space on an element moves the mouse to it and clicks it. Hold Shift while pressing Enter or Space to double-click the element instead.
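
The keyboard behaviour described above can be sketched as a focus index moving over a list of detected elements. This is a simplified model, not Viewpoint's real code: the `UIElement` and `ElementNavigator` names, the coordinate fields, and the assumption that focus wraps at either end of the list are hypothetical. The sketch only reports what would happen and does not move the mouse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UIElement:
    """A detected control; the fields are hypothetical."""

    label: str
    x: int  # assumed screen coordinates of the element's centre
    y: int


class ElementNavigator:
    """Tracks which detected element has focus and maps keys to actions."""

    def __init__(self, elements):
        if not elements:
            raise ValueError("no elements detected")
        self.elements = list(elements)
        self.index = 0

    @property
    def focused(self) -> UIElement:
        return self.elements[self.index]

    def handle_key(self, key: str, shift: bool = False) -> tuple:
        """Return a tuple describing the action for one key press."""
        if key == "Tab":
            # Tab moves forward, Shift + Tab moves backward (assumed to wrap).
            step = -1 if shift else 1
            self.index = (self.index + step) % len(self.elements)
            return ("focus", self.focused.label)
        if key in ("Enter", "Space"):
            # Holding Shift requests a double-click instead of a single click.
            clicks = 2 if shift else 1
            element = self.focused
            return ("click", element.label, element.x, element.y, clicks)
        return ("ignored", key)


nav = ElementNavigator([
    UIElement("File", 20, 10),
    UIElement("Edit", 60, 10),
    UIElement("Save", 300, 400),
])
print(nav.handle_key("Tab"))                # ('focus', 'Edit')
print(nav.handle_key("Tab", shift=True))    # ('focus', 'File')
print(nav.handle_key("Enter", shift=True))  # ('click', 'File', 20, 10, 2)
```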

After clicking on an element, Viewpoint will rebuild the UI after a short delay, allowing you to continuously use the mode without having to reactivate it. If you wish to exit UI Mode without clicking an item, you can press Ctrl + Alt + Shift + /.

Note: For the best experience, make sure to maximize the window before activating UI Mode.

OCR Mode

OCR Mode identifies all on-screen text and displays it in an accessible format. When activated, Viewpoint takes a screenshot and displays all detected text in a simple dialog.

You can navigate the dialog with standard text navigation commands. There is also a button to copy all of the detected text to the clipboard, enabling you to paste it somewhere else or save it to a file.

Note: Occasionally, the results dialog may not immediately gain focus. In such cases, you may need to use Alt + Tab to focus it before reading the detected text.

PDF Reader

The PDF Reader extracts text from scanned or otherwise inaccessible PDF files and displays it in a simple dialog. When activated, Viewpoint prompts you to select the PDF file you wish to scan. After you choose a file, Viewpoint extracts and displays all of its text.

You can navigate the dialog with standard text navigation commands. There is also a button to copy all of the detected text to the clipboard, enabling you to paste it somewhere else or save it to a file.

Note: Occasionally, the results and file upload dialogs may not immediately gain focus. In such cases, you may need to use Alt + Tab to focus them before attempting to interact with them.