This is kinda huge. We have implemented a new experimental scripting API which lets you work with EyeAuras Computer Vision without using Auras at all.
All you have to do is write a few lines of code, and all the tools built throughout the years are right at your disposal - image search, color search, neural networks, etc.
Note that this API is straight out of the oven and is subject to change, not to mention bugs and problems - please send your reports.
This is how working with the new API could look. For starters, let's find an image which is somewhere on the screen.
var cv = GetService<IComputerVisionExperimentalScriptingApi>()
    .ForScreen() //search through the entire screen, could be ForWindow - see below
    .EnableOsd(fps: 10); //optional, with on-screen display enabled, refreshed @ 10 fps
var targetImagePath = @"C:\images\target.png"; //placeholder - could be a local file or a URL
var position = cv.ImageSearch(targetImagePath);
if (!position.IsEmpty)
{
    Log.Info($"Found @ {position}");
}
else
{
    Log.Info("Image not found");
}
That is it - no auras, no triggers, no actions. Everything is right there in these few lines of code.
Under the hood, EA will try to optimize all your calls as much as possible, e.g. cache images, cache models, pre-load everything, etc.
There is a lot of performance optimization work yet to be done in that area, but even the current state should be more than enough for most tasks.
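A quick way to see the caching at work is to time the first call against the subsequent ones. This is just an illustrative sketch reusing the cv and targetImagePath from the snippet above; actual timings will vary:
var sw = System.Diagnostics.Stopwatch.StartNew();
cv.ImageSearch(targetImagePath); //cold call - loads the template and warms up the caches
Log.Info($"Cold call: {sw.ElapsedMilliseconds}ms");
sw.Restart();
cv.ImageSearch(targetImagePath); //warm call - reuses the cached data, should be noticeably faster
Log.Info($"Warm call: {sw.ElapsedMilliseconds}ms");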
Here is what we get from the get-go:
Refresh()
This is an example of a bot which tracks and clicks on an image (using Aim Trainer).
Things to note on the OSD:
60-80 hits/second
//this is an example of a script
//which uses image search to find and follow a specific image inside a window
//a natural optimization is two-pass probing - if we know the previous location of the image,
//it makes sense to repeat the search in that specific region first (see the sketch after this example)
//picking a window
var windowSelector = GetService<IWindowSelector>();
windowSelector.TargetWindow = "Aim Trainer";
var targetWindow = windowSelector.ActiveWindow ?? throw new InvalidOperationException("Window not found");
//this API is used to do Computer Vision stuff (image/text/color/ml search)
var cv = GetService<IComputerVisionExperimentalScriptingApi>()
.ForWindow(targetWindow) //for a specific window
.EnableOsd(fps: 10); //with enabled on-screen-display refreshed @ 10 fps
//this API is used to generate mouse movements
var sendInput = GetService<ISendInputScriptingApi>();
sendInput.TargetWindow = targetWindow; //for a specific window
//could be either local file or URL
var targetImagePath = @"C:\Users\Xab3r\Documents\ShareX\Screenshots\2025-03\uupb4m3RQkfHa8sv.png";
//repeat until script is stopped
while (true)
{
    //try to find the image
    var position = cv.ImageSearch(targetImagePath);
    if (position.IsEmpty)
    {
        //not found - try again
        continue;
    }
    Log.Info($"Found @ {position}");
    //found the image - moving the mouse to the center and clicking on the object
    sendInput.MouseMoveTo(position.Center());
    sendInput.MouseClick();
}
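And here is a minimal sketch of the two-pass probing mentioned above. It assumes ImageSearch accepts an optional search region (mirroring the MLSearchRaw(path, rectangle) form shown further below) and that the returned position is a Rectangle - treat the signatures as assumptions, not the final API:
var lastKnownRegion = Rectangle.Empty;
while (true)
{
    //1st pass - if we saw the image before, probe around its last known location
    var position = lastKnownRegion.IsEmpty
        ? Rectangle.Empty
        : cv.ImageSearch(targetImagePath, lastKnownRegion); //region overload is an assumption
    //2nd pass - fall back to searching the whole window
    if (position.IsEmpty)
    {
        position = cv.ImageSearch(targetImagePath);
    }
    if (position.IsEmpty)
    {
        lastKnownRegion = Rectangle.Empty; //lost it - search everywhere next time
        continue;
    }
    //remember an inflated region around the hit for the next iteration
    lastKnownRegion = Rectangle.Inflate(position, position.Width, position.Height);
    sendInput.MouseMoveTo(position.Center());
    sendInput.MouseClick();
}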
Or, alternatively, this is how the same loop could look using a neural network.
ML is aimed at more experienced users; currently the method returns a raw result in the form of WindowImageProcessedEventArgs, which provides the maximum possible amount of control over the extracted data.
//as with the image, the model path could be local or could point to a remote file
var mlModelPath = @"https://s3.eyeauras.net/media/2025/03/AimLab_20240213193604OOBfW2f00U5K.onnx";
while (true)
{
    //get predictions from the model
    var result = cv.MLSearchRaw(mlModelPath);
    if (!result.Detected.Predictions.Any())
    {
        //not found
        continue;
    }
    var currentPosition = result //WindowImageProcessedEventArgs
        .Detected //get detection results
        .Predictions //get predictions
        .First() //more specifically, the first one
        .Rectangle //get bounding box of that prediction IN LOCAL (aka World) coordinates
        .Transform(result.ViewportTransforms.WorldToWindow); //transform World coordinates to Window coordinates
    if (!currentPosition.IsEmpty)
    {
        Log.Info($"Found via ML @ {currentPosition}");
        //found the object - moving the mouse to the center and clicking
        sendInput.MouseMoveTo(currentPosition.ToWinRectangle().Center());
        sendInput.MouseClick();
    }
}
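One thing to keep in mind: First() simply takes whichever prediction happens to come first. If the model can return several detections per frame, you will usually want the most confident one instead - something along these lines, assuming each prediction exposes a confidence value ("Score" below is a guessed property name, check the actual prediction type):
var bestPrediction = result
    .Detected
    .Predictions
    .OrderByDescending(x => x.Score) //"Score" is an assumed property name
    .First();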
Alongside the new API, we needed more convenient tools for working with it.
The main idea is that you use them to pick values which can then be inserted into the code, be it coordinates, a color, a window title or a process name.
This should greatly speed up development, especially for smaller scripts.
The list includes:
var pixelLocation = cv.PixelSearch(Color.FromArgb(26, 26, 26)); //<- Color.FromArgb(26, 26, 26) inserted by Color selector
var cv = GetService<IComputerVisionExperimentalScriptingApi>().ForWindow("l2.bin"); //<- "l2.bin" inserted by Process selector
cv.MLSearchRaw(mlModelPath, new Rectangle(719, 406, 988, 584)); //<- new Rectangle(719, 406, 988, 584) inserted by Region selector
sendInput.MouseMoveTo(new Point(190, 393)); //<- new Point(190, 393) inserted by Point selector
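Put together, a tiny picker-driven script could look like the sketch below. It builds only on the calls shown above, plus my assumption that PixelSearch returns the same kind of position as ImageSearch (with IsEmpty and Center()):
var cv = GetService<IComputerVisionExperimentalScriptingApi>().ForWindow("l2.bin"); //"l2.bin" picked via Process selector
var sendInput = GetService<ISendInputScriptingApi>();
var pixelLocation = cv.PixelSearch(Color.FromArgb(26, 26, 26)); //color picked via Color selector
if (!pixelLocation.IsEmpty) //assuming the result mirrors ImageSearch
{
    sendInput.MouseMoveTo(pixelLocation.Center());
    sendInput.MouseClick();
}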
When you work with coordinates, do not forget that they can be either absolute (screen) or relative (window). Depending on the use case, one or the other may be preferable. The new tools support both - there are two different kinds of pickers, use at your own discretion.
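If you ever need to convert between the two yourself, it is just an offset by the window's top-left corner. The sketch below is purely illustrative - WindowBounds is a hypothetical property name, look up the actual member that exposes the window's screen rectangle:
//window-relative -> absolute (screen) coordinates
var windowBounds = targetWindow.WindowBounds; //hypothetical property - the window's rectangle in screen coordinates
var relative = new Point(190, 393); //picked via the window-relative Point selector
var absolute = new Point(windowBounds.Left + relative.X, windowBounds.Top + relative.Y);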
- Renamed SendInputUnstableScriptingApi to SendInputScriptingApi - there were no changes for almost a year, looks stable enough
- Added Percentage, which can be used to denote that the value is in % (e.g. 0.1 = 10%)
- Added Opacity
- Added a WorldToWindow transformation matrix to WindowImageProcessedEventArgs - this makes it very easy to calculate in-window coordinates after Refresh()
- Fixed: R and G color channels were swapped