Vision Library API
This page documents the vision methods for reading and processing the screen.
macro_studio.vision
captureScreenText
Captures a region of the screen and extracts text using Tesseract OCR.
This method performs a screen grab via MSS, converts the buffer to a grayscale binary image for better contrast, and then processes it through the Tesseract engine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| bounds | QRect | The rectangular area of the screen to read from. | required |
Returns:
| Type | Description |
|---|---|
| str | The extracted text string, stripped of leading/trailing whitespace. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the Tesseract OCR binary is not installed at the path specified in 'pytesseract.pytesseract.tesseract_cmd'. |
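
The grayscale-and-binarize preprocessing described for captureScreenText can be sketched in plain Python. This is only an illustration: the Rec. 601 luma weights, the fixed threshold of 128, and the helper names `to_grayscale` and `to_binary` are assumptions, and the sketch does not call MSS or Tesseract. The library's actual preprocessing may differ.

```python
def to_grayscale(pixel):
    """Approximate the gray level of an (R, G, B) pixel (Rec. 601 weights, assumed)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def to_binary(image, threshold=128):
    """Map every pixel to 255 (light) or 0 (dark) to maximize text contrast."""
    return [
        [255 if to_grayscale(px) >= threshold else 0 for px in row]
        for row in image
    ]


# A tiny 2x2 "screenshot" as nested lists of RGB tuples.
sample = [
    [(250, 250, 250), (20, 20, 20)],
    [(200, 180, 40), (0, 0, 120)],
]
binary = to_binary(sample)
```

After this step each pixel is either fully black or fully white, which is the kind of high-contrast input OCR engines tend to handle best.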
captureScreenColor
Captures the QColor of a specific pixel on the screen.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| point | QPoint | The specific pixel location to read from. | required |
Returns:
| Type | Description |
|---|---|
| QColor | The QColor of the specified pixel. |
isColorSimilar
Checks if two colors are within a certain Euclidean distance in RGB space.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| color_a | QColor | The first color to compare (usually captured from the screen). | required |
| color_b | QColor | The second color to compare (usually the target variable). | required |
| tolerance | int | The maximum Euclidean distance allowed between colors. 0 is an exact match, 10-20 is tight, 50+ is loose. | 10 |
Returns:
| Type | Description |
|---|---|
| bool | True if the distance between the two colors is <= tolerance, False otherwise. |
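
To make the tolerance scale concrete, the Euclidean RGB comparison can be sketched as follows. The sketch uses plain `(R, G, B)` tuples instead of QColor, and `rgb_distance` and `is_color_similar` are hypothetical stand-ins, not the library's own functions.

```python
import math


def rgb_distance(color_a, color_b):
    """Euclidean distance between two (R, G, B) tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(color_a, color_b)))


def is_color_similar(color_a, color_b, tolerance=10):
    """Hypothetical stand-in for isColorSimilar using tuples instead of QColor."""
    return rgb_distance(color_a, color_b) <= tolerance


target = (255, 0, 0)
captured = (250, 5, 0)

close_enough = is_color_similar(captured, target)        # distance is about 7.07
too_far = is_color_similar((200, 0, 0), target)          # distance is 55
loose_match = is_color_similar((200, 0, 0), target, 60)  # passes with a loose tolerance
```

A shift of a few units per channel stays inside the default tolerance of 10, while a 55-unit change in a single channel only passes with a loose tolerance.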
isColorSimilarPerceptual
Checks whether two colors are within a given distance of each other in an RGB space weighted to reflect human perception.
Best for distinguishing between subtle UI shades (e.g., 'Active' vs 'Inactive' buttons).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| color_a | QColor | The first color to compare (usually captured from the screen). | required |
| color_b | QColor | The second color to compare (usually the target variable). | required |
| tolerance | int | The maximum weighted distance allowed between colors. 0 is an exact match, 10-20 is tight, 50+ is loose. | 10 |
Returns:
| Type | Description |
|---|---|
| bool | True if the distance between the two colors is <= tolerance, False otherwise. |
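
This page does not state which weights isColorSimilarPerceptual uses. As one possible illustration, the sketch below uses the well-known "redmean" approximation, which weights the green channel most heavily and adjusts the red and blue weights by the mean red level. The choice of redmean and the name `redmean_distance` are assumptions, and the library's actual weighting may differ.

```python
import math


def redmean_distance(color_a, color_b):
    """Weighted RGB distance using the 'redmean' approximation (assumed weighting)."""
    r1, g1, b1 = color_a
    r2, g2, b2 = color_b
    r_mean = (r1 + r2) / 2
    dr, dg, db = r1 - r2, g1 - g2, b1 - b2
    return math.sqrt(
        (2 + r_mean / 256) * dr * dr
        + 4 * dg * dg
        + (2 + (255 - r_mean) / 256) * db * db
    )


base = (100, 100, 100)
green_shift = redmean_distance(base, (100, 120, 100))  # 20-unit change in green
blue_shift = redmean_distance(base, (100, 100, 120))   # 20-unit change in blue
```

With these weights, the same 20-unit change counts for more in the green channel than in the blue channel, which is why a weighted metric can separate shades that a plain Euclidean distance treats as equally far apart.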
isBrightnessSimilar
Checks whether the lightness/luminance of two colors is similar.
Best for detecting if a screen region flashes, dims, or highlights, regardless of the actual color hue.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| color_a | QColor | The first color to compare (usually captured from the screen). | required |
| color_b | QColor | The second color to compare (usually the target variable). | required |
| tolerance | int | The maximum brightness difference allowed between colors. 0 is an exact match, 10-20 is tight, 50+ is loose. | 10 |
Returns:
| Type | Description |
|---|---|
| bool | True if the brightness difference between the two colors is <= tolerance, False otherwise. |
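
A brightness-only comparison can be sketched by reducing each color to a single luma value and comparing the two numbers. The Rec. 601 weights and the helper names `luma` and `is_brightness_similar` are assumptions for illustration; this page does not specify how the library computes lightness.

```python
def luma(color):
    """Approximate brightness of an (R, G, B) tuple (Rec. 601 weights, assumed)."""
    r, g, b = color
    return 0.299 * r + 0.587 * g + 0.114 * b


def is_brightness_similar(color_a, color_b, tolerance=10):
    """Hypothetical stand-in for isBrightnessSimilar using tuples instead of QColor."""
    return abs(luma(color_a) - luma(color_b)) <= tolerance


# Pure red and a dark gray differ in hue but have nearly the same luma (about 76).
same_brightness = is_brightness_similar((255, 0, 0), (76, 76, 76))

# White against the same gray is a large brightness jump, as in a screen flash.
flash_matches = is_brightness_similar((255, 255, 255), (76, 76, 76))
```

Because hue is discarded, two very different colors can count as similar, while a flash or dimming of the same hue does not.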
findImageCenter
findImageCenter(template_path: str, bounds: QRect | None = None, threshold: float = 0.8) -> tuple[QPoint, float] | None
Finds an image template on the screen and returns its absolute center coordinates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| template_path | str | Path to the template image. | required |
| bounds | QRect \| None | The bounds to search for the template in. If no bounds are provided, it searches the entire primary monitor. | None |
| threshold | float | Confidence threshold to consider the result as a potential match. | 0.8 |
Returns:
| Type | Description |
|---|---|
| tuple[QPoint, float] \| None | The absolute center coordinates of the found template object and the confidence score, or None if not found. |
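
The phrase "absolute center" implies a coordinate conversion when a search region is given. The sketch below assumes the matcher reports the top-left corner of the best match relative to the search region, as template matchers commonly do, and that a score at or above the threshold counts as a match. Both points, along with the helper names `absolute_center` and `accept_match`, are assumptions and not confirmed behavior of findImageCenter.

```python
def absolute_center(match_top_left, template_size, region_origin=(0, 0)):
    """Convert a region-relative top-left match position to an absolute center."""
    mx, my = match_top_left
    tw, th = template_size
    ox, oy = region_origin
    return (ox + mx + tw // 2, oy + my + th // 2)


def accept_match(score, threshold=0.8):
    """Treat a score at or above the threshold as a match (assumed comparison)."""
    return score >= threshold


# A 30x20 template found at (40, 25) inside a region whose origin is (100, 200).
center = absolute_center((40, 25), (30, 20), region_origin=(100, 200))
strong = accept_match(0.93)
weak = accept_match(0.55)
```

Since the documented return value may be None, callers should check the result before unpacking it into a point and a confidence score.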
getScreenState
Captures a region and returns it as a BGR numpy array for custom processing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
bounds
|
| bounds | QRect \| None | The region to capture. If | None |
Returns:
| Type | Description |
|---|---|
| ndarray | A BGR numpy array for custom processing. |
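
Because the returned array is BGR rather than RGB, channel order matters when reading individual pixels. The sketch below uses nested lists as a stand-in for the ndarray and assumes the usual image layout of rows first (y), then columns (x), then the three channels. The helper `bgr_to_rgb` is hypothetical.

```python
def bgr_to_rgb(pixel):
    """Reorder a (B, G, R) pixel into (R, G, B)."""
    b, g, r = pixel
    return (r, g, b)


# A 2x2 stand-in frame: frame[y][x] holds one [B, G, R] pixel.
frame = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [10, 20, 30]],
]

x, y = 0, 1
rgb = bgr_to_rgb(frame[y][x])  # the stored [0, 0, 255] is pure red in BGR order
```

Reading the channels in the wrong order swaps red and blue, which is a common source of mismatches when comparing a captured pixel against an RGB target.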