OCR (Optical Character Recognition)
Basic Introduction
OCR (Optical Character Recognition) technology refers to the use of artificial intelligence models to extract text information from images. Its core task is to convert text content in images into editable text data, which is widely applied in fields such as document digitization, information extraction, and data entry.
Currently, the OCR model available on Model Plaza includes: GOT-OCR2_0
GOT-OCR2_0
GOT-OCR2_0 provides a powerful OCR solution, capable of extracting text information from images with high accuracy, speed, and comprehensiveness. It supports multiple languages and is suitable for various scenarios, such as extracting information from ID cards, bank cards, PDF documents, tables, license plates, handwritten text, equipment nameplates, mathematical formulas, and other images.
Usage Method
You can click GOT-OCR2_0 for free online experience. The following are code invocation examples.
- Bash
- Python
curl https://moark.ai/v1/images/ocr \
-X POST \
-H "Authorization: Bearer 私人令牌" \
-F "model=GOT-OCR2_0"
-F "image=@path/to/image.jpg"
-F "response_format=text"
import requests
API_URL = "https://moark.ai/v1/images/ocr"
HEADERS = {
"Authorization": "Bearer 私人令牌",
}
def query(image_path, model="GOT-OCR2_0", response_format="text"):
with open(image_path, "rb") as image_file:
response = requests.post(
API_URL,
headers=HEADERS,
files={"image": (image_path, image_file)},
data={"model": model, "response_format": response_format},
)
return response.json()
output = query("test.jpg")
print(output) # {"text": "xxx"}
Parameter Description:
- Private token: Used to verify the identity of the caller. Click Private Token to obtain it.
- model: Fill in GOT-OCR2_0 to specify the use of the OCR large model.
- image: The image file to be processed with OCR.
- Supports
png,jpg,jpeg,webp,gifformats. - Maximum resolution support:
4096X4096. - File size does not exceed
3MB.
- Supports
- response_format: Format type.
- Value
textreturns plain text content. - Value
formatreturns content in mathpix-markdown format, recommended for images with mathematical formulas.
- Value
Usage Example

After executing the above code, the response will be:
{
"text": "美迪兰(南京)医疗设备有限公司\n名称:..."
}
The online experience effect of GOT-OCR2_0 is as follows:
