Getting Started with Apple's Vision Framework: A Developer's Perspective

The Vision framework was introduced by Apple at WWDC 2017 as part of iOS 11. Its launch was a turning point in the evolution of machine vision and image analysis, giving developers native tools for analyzing visual content and performing further processing as needed.
In 2017, Vision introduced:
- Text detection
- Face detection
- Rectangle detection
- Barcode and QR code recognition
Since its debut, Apple has continually enhanced the Vision framework, evolving it to meet modern requirements. As of late 2024, with iOS 18, Vision offers:
- Improved text recognition accuracy with support for a large number of languages
- Detection of faces and their features
- Motion analysis
- Pose recognition, including hand positions and key points of the human body
- Support for object tracking in video
- Improved integration with Core ML for working with custom machine learning models
- Deep integration with related frameworks, such as AVKit and ARKit
With the arrival of the Vision framework, developers gained the ability to perform image and video analysis tasks natively, without relying on third-party solutions. These capabilities include scanning documents, recognizing text, identifying faces and poses, detecting duplicate images, and automating various processes that streamline business operations.
In this article, we will look at the main scenarios for using Vision, with code examples that will help you understand how to work with it, see that it is not difficult, and start applying it in your own applications.
VNRequest
Vision has an abstract class, VNRequest, which defines the structure of a request in Vision. Its subclasses implement specific requests that perform specific tasks on an image.
All subclasses inherit this initializer from the VNRequest class:
public init(completionHandler: VNRequestCompletionHandler? = nil)
The completion handler returns the result of processing the request. It is important to note that the result is delivered on the same queue on which the request was performed.
Here VNRequestCompletionHandler is a typealias:
public typealias VNRequestCompletionHandler = (VNRequest, (any Error)?) -> Void
It passes back the VNRequest with the results of the request, or an error if the request could not be performed because of a system error, an invalid image, and so on.
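Since the completion handler runs on whichever queue performs the request, one common pattern is to do the work on a background queue. The following is a minimal sketch of that idea, not the only way to do it. The names logOutcome and performOffMainThread are hypothetical helpers introduced here for illustration, and the code assumes the Swift 5 language mode (stricter concurrency checking may require changes).

```swift
import CoreGraphics
import Dispatch
import Vision

// A handler matching the VNRequestCompletionHandler typealias.
// It checks the error first, then counts the observations in `results`.
let logOutcome: VNRequestCompletionHandler = { request, error in
    if let error = error {
        print("Request failed: \(error.localizedDescription)")
        return
    }
    print("Request produced \(request.results?.count ?? 0) observation(s)")
}

// Hypothetical helper (not a Vision API): performs the given requests
// on a background queue so the calling thread is not blocked.
func performOffMainThread(_ requests: [VNRequest], on cgImage: CGImage) {
    DispatchQueue.global(qos: .userInitiated).async {
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            // perform(_:) is synchronous: each completion handler is called
            // on this same background queue before perform(_:) returns.
            try handler.perform(requests)
        } catch {
            print("Could not perform requests: \(error)")
        }
    }
}
```

For instance, VNDetectRectanglesRequest(completionHandler: logOutcome) could be passed to performOffMainThread(_:on:). Any UI update made from the handler would then need to hop back to the main queue.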
VNRecognizeTextRequest
The VNRecognizeTextRequest class, a subclass of the abstract VNRequest class, is designed to handle text recognition requests on images.
An example of a text recognition request:
import Vision
import UIKit

func recognizeText(from image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNRecognizeTextRequest { request, error in // 1
        if let error = error {
            print("Text recognition failed: \(error)")
            return
        }
        guard let observations = request.results as? [VNRecognizedTextObservation] else { return } // 2
        for observation in observations {
            // Take the most likely candidate for each detected text region
            if let topCandidate = observation.topCandidates(1).first {
                print("Recognized text: \(topCandidate.string)")
                print("Text boundingBox: \(observation.boundingBox)")
                print("Confidence: \(topCandidate.confidence)")
            }
        }
    }
    request.recognitionLevel = .accurate
    request.usesLanguageCorrection = true

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:]) // 3
    do {
        try handler.perform([request]) // 3
    } catch {
        print("Failed to perform the request: \(error)")
    }
}
1. Create a VNRecognizeTextRequest to recognize text.
2. Receive the results of the text recognition request as an array of VNRecognizedTextObservation objects.
   The VNRecognizedTextObservation object contains:
   - The recognized text candidates (VNRecognizedText().string)
   - The recognition confidence (VNRecognizedText().confidence)
   - The coordinates of the recognized text in the image (VNRecognizedTextObservation().boundingBox)
3. Create an image request handler and perform the text recognition request.
Example: recognizing a taxpayer identification number and a passport number when developing your own SDK for document recognition.
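The boundingBox that Vision reports is normalized to the range 0...1, with the origin in the lower-left corner of the image. To draw it over the image in UIKit, it usually has to be scaled to pixels and flipped vertically. Here is a minimal sketch of that conversion; the function name imageRect(forNormalizedRect:imageSize:) is a hypothetical helper, and the sketch assumes the image orientation is already "up".

```swift
import CoreGraphics
import Vision

// Hypothetical helper (not a Vision API): maps a normalized Vision bounding box
// to a pixel rectangle in a coordinate space whose origin is the top-left corner.
func imageRect(forNormalizedRect boundingBox: CGRect, imageSize: CGSize) -> CGRect {
    // Scale the 0...1 rectangle to pixel units.
    // The result is still measured from the lower-left corner.
    let scaled = VNImageRectForNormalizedRect(boundingBox,
                                              Int(imageSize.width),
                                              Int(imageSize.height))
    // Flip the y-axis so the rectangle is measured from the top-left corner.
    return CGRect(x: scaled.minX,
                  y: imageSize.height - scaled.maxY,
                  width: scaled.width,
                  height: scaled.height)
}
```

Inside a completion handler, this could be called as imageRect(forNormalizedRect: observation.boundingBox, imageSize: CGSize(width: cgImage.width, height: cgImage.height)).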
VNDetectFaceRectanglesRequest
This class finds faces in an image and returns their coordinates.
An example of a face detection request:
import Vision
import UIKit

func detectFaces(from image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNDetectFaceRectanglesRequest { request, error in // 1
        if let error = error {
            print("Face detection failed: \(error)")
            return
        }
        guard let results = request.results as? [VNFaceObservation] else { return } // 2
        for face in results {
            print("Face detected: \(face.boundingBox)")
        }
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:]) // 3
    do {
        try handler.perform([request]) // 3
    } catch {
        print("Failed to perform the request: \(error)")
    }
}
1. Create a VNDetectFaceRectanglesRequest to detect faces in an image.
2. Receive the results of the face detection request as an array of VNFaceObservation objects.
   The VNFaceObservation object contains:
   - The coordinates of the detected face (VNFaceObservation().boundingBox)
3. Create an image request handler and perform the face detection request.
Example: in banks, KYC onboarding requires you to take a picture of your passport; this way, you can confirm that it shows the face of a real person.
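Because perform(_:) is synchronous, the results can also be read from the request itself once perform(_:) returns, without a completion handler. The sketch below uses that to count faces and lets the caller handle errors. The function name faceCount(in:) is a hypothetical helper, not part of Vision.

```swift
import CoreGraphics
import Vision

// Hypothetical helper (not a Vision API): returns how many faces Vision
// detects in the image, rethrowing any error raised while performing.
func faceCount(in cgImage: CGImage) throws -> Int {
    let request = VNDetectFaceRectanglesRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    // perform(_:) blocks until the request has finished.
    try handler.perform([request])
    // `results` is nil if the request produced nothing.
    return request.results?.count ?? 0
}
```

A KYC-style check could, for example, require that faceCount(in:) returns exactly 1 before accepting a photo. Note that this only detects a face; it does not verify whose face it is.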
VNDetectBarcodesRequest
This class recognizes barcodes and QR codes in an image.
An example of a request that recognizes barcodes and QR codes:
import Vision
import UIKit

func detectBarcodes(from image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNDetectBarcodesRequest { request, error in // 1
        if let error = error {
            print("Barcode detection failed: \(error)")
            return
        }
        guard let results = request.results as? [VNBarcodeObservation] else { return } // 2
        for barcode in results {
            print("Barcode found: \(barcode.payloadStringValue ?? "no data")")
        }
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:]) // 3
    do {
        try handler.perform([request]) // 3
    } catch {
        print("Failed to perform the request: \(error)")
    }
}
1. Create a VNDetectBarcodesRequest to recognize barcodes.
2. Receive the results of the request as an array of VNBarcodeObservation objects.
   The VNBarcodeObservation object has many properties, including:
   - VNBarcodeObservation().payloadStringValue: the string value of the barcode or QR code.
3. Create an image request handler and perform the barcode recognition request.
Example: a QR scanner that reads QR codes for payments.
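For a payment scanner that only cares about QR codes, the request can be limited to that symbology through its symbologies property. A minimal sketch, assuming a recent SDK in which the QR symbology is spelled .qr; the function makeQRCodeRequest(onPayload:) and its callback are hypothetical names introduced for illustration.

```swift
import Vision

// Hypothetical helper (not a Vision API): builds a barcode request that is
// restricted to QR codes and forwards each decoded payload string.
func makeQRCodeRequest(onPayload: @escaping (String) -> Void) -> VNDetectBarcodesRequest {
    let request = VNDetectBarcodesRequest { request, error in
        guard error == nil,
              let results = request.results as? [VNBarcodeObservation] else { return }
        for code in results {
            // payloadStringValue is nil when the payload is not a string.
            if let payload = code.payloadStringValue {
                onPayload(payload)
            }
        }
    }
    // Ignore every symbology except QR.
    request.symbologies = [.qr]
    return request
}
```

The returned request is performed like any other, by passing it to a VNImageRequestHandler.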
We have covered the three main request types in Vision to help you get started with this powerful tool.