Under the Hood
If you wish to understand what this library is actually doing, this section dives into the details.
We are using the Camera2 APIs. These are the new(ish) APIs which replace the now-deprecated Camera1 APIs.
The first step to using the camera to capture images (frames) is to open a camera.
First you need a `CameraManager`, which can be retrieved from a `Context`:

```kotlin
context.getSystemService(CAMERA_SERVICE) as CameraManager
```
Next you need the id of the camera you wish to open. See `fun selectCamera(): String` for our implementation; you could use a specific characteristic of a camera to help you choose, or simply take the first one.
Now ask the `CameraManager` to open the camera, passing an id and a callback so you know when the camera is opened (and can handle any errors):

```kotlin
cameraManager.openCamera(cameraId, callback, handler)
```
Within the callback you will be given a `CameraDevice`, which you can think of as your camera.
To actually receive any frames from the camera you create a session and set a request. Think of a session as your photo shoot, and you requesting poses from your model.
To start a session, call `cameraDevice.createCaptureSession(surfaces, callback, handler)`, passing your surfaces and a callback so you know when your session is configured. The surfaces you pass in are where you want your frames to be sent: for example, a `SurfaceView` to show a preview, or an `ImageReader` to process frames yourself.
Within `fun onConfigured(session: CameraCaptureSession)` of your callback, you then need to make your requests. You do this using a `CaptureRequest.Builder` and then calling a request method on your `CameraCaptureSession`. Which method you call depends on your situation, but for a preview to display to the user you can use `session.setRepeatingRequest(request, listener, handler)`.
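Putting the steps above together, a condensed sketch of the whole open-camera → create-session → repeating-request flow might look like this. This is illustrative Android-framework code (it only runs on a device, and assumes the CAMERA permission has already been granted); `previewSurface` and `handler` are assumed to exist, and error handling is kept minimal:

```kotlin
import android.content.Context
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CameraDevice
import android.hardware.camera2.CameraManager
import android.os.Handler
import android.view.Surface

// Sketch only: open the first camera, create a session with one surface,
// and start a repeating preview request.
fun startPreview(context: Context, previewSurface: Surface, handler: Handler) {
    val cameraManager = context.getSystemService(Context.CAMERA_SERVICE) as CameraManager
    val cameraId = cameraManager.cameraIdList.first() // or select by characteristics

    cameraManager.openCamera(cameraId, object : CameraDevice.StateCallback() {
        override fun onOpened(camera: CameraDevice) {
            camera.createCaptureSession(listOf(previewSurface),
                object : CameraCaptureSession.StateCallback() {
                    override fun onConfigured(session: CameraCaptureSession) {
                        val request = camera.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW)
                            .apply { addTarget(previewSurface) }
                            .build()
                        session.setRepeatingRequest(request, null, handler)
                    }

                    override fun onConfigureFailed(session: CameraCaptureSession) {
                        // handle configuration failure
                    }
                }, handler)
        }

        override fun onDisconnected(camera: CameraDevice) = camera.close()
        override fun onError(camera: CameraDevice, error: Int) = camera.close()
    }, handler)
}
```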
We use MLKit to do barcode detection; see the Google docs for more information on this.
We used to use Firebase MLKit before it was deprecated in favour of Google MLKit, so some of the architecture may be informed by that.
In order to make the code more understandable (and separate concerns), we have provided an interface (`ImageProcessor`), a base class (`ImageProcessorBase`) and a barcode implementation (`BarcodeImageProcessor`). Together these keep track of what's currently being processed, call into MLKit to process the frame, and report back the results in an abstracted manner.
We call into these classes from the `Camera2Source` by adding an `ImageReader` surface to the session request. Then, using an `OnImageAvailableListener`, we check whether a frame is still being processed. If it is, we simply drop the new frame (knowing another will be along in a minute anyway); otherwise we request that the frame be processed. Using two callbacks, `onBarcodeListener` and `onImageProcessed`, we pass the results back and make sure every frame gets closed, so we don't leak any resources.
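The frame-dropping idea can be illustrated in isolation. This is not the library's actual implementation, just a minimal sketch of the gating logic using an `AtomicBoolean` (in the real code, the flag is cleared asynchronously when `onImageProcessed` fires):

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Illustrative only: accept a frame for processing unless one is already in flight.
class FrameGate {
    private val processing = AtomicBoolean(false)

    /** Returns true if the frame was accepted, false if it was dropped. */
    fun tryProcess(process: () -> Unit): Boolean {
        // Busy? Drop this frame; another will arrive shortly.
        if (!processing.compareAndSet(false, true)) return false
        try {
            process()
        } finally {
            processing.set(false) // analogous to onImageProcessed closing the frame
        }
        return true
    }
}
```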
When creating our `ImageReader` surface we need to decide the output size we would like. To know what sizes the camera can output, we query the camera characteristics:

```kotlin
val characteristics = cameraManager.getCameraCharacteristics(cameraId)
val configs = characteristics.get(CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP)
    ?: throw IllegalStateException()
val sizes = configs.getOutputSizes(IMAGE_FORMAT)
```
For the `IMAGE_FORMAT` we use `ImageFormat.YUV_420_888`, which Google recommends.
To choose a size requires two considerations:
- The smaller the size the quicker it is processed
- The larger the size, the better the quality of the results
If you break down what a barcode is, it can give an indication of what sort of size you would require to get good results.
For example, EAN-13 barcodes are made up of bars and spaces that are 1, 2, 3, or 4 units wide. Therefore an EAN-13 barcode image ideally has bars and spaces that are at least 2, 4, 6, and 8 pixels wide respectively, with 2 pixels per unit providing sufficient width for recognition. Since an EAN-13 barcode is 95 units wide in total, the barcode should be at least 190 pixels (95 units * 2 pixels) wide.
Therefore depending on the barcode formats we support we can deduce a minimum width we would like our barcode image to be.
We also need to consider how far away the user holds the camera from the barcode. For example, on a Google Pixel, holding an EAN-13 barcode (2.5cm in width) 5cm away from the camera lens means the barcode occupies roughly 50% of the resulting image.
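Putting numbers to this gives a worked example. The function below is purely illustrative (the figures are the EAN-13 values discussed above, not the library's actual constants):

```kotlin
// Illustrative: minimum output width we'd want, given a barcode's width in
// "units", the pixels needed per unit, and the proportion of the frame the
// barcode is expected to fill.
fun minimumImageWidth(unitsWide: Int, pixelsPerUnit: Int, screenProportion: Double): Int {
    val minBarcodeWidthPx = unitsWide * pixelsPerUnit        // e.g. 95 * 2 = 190 px
    return (minBarcodeWidthPx / screenProportion).toInt()    // e.g. 190 / 0.5 = 380 px
}
```

So an EAN-13 barcode filling half the frame suggests an image at least 380 pixels wide.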
Combining these together results in:

```kotlin
val maxSizedFormat = barcodeFormats.maxBy { it.getMinWidth() ?: 0 }
val minWidth = maxSizedFormat?.getMinWidth() ?: BARCODE_FORMAT_ALL_MIN_WIDTH
val minWidthProportioned = minWidth / BARCODE_SCREEN_PROPORTION
val largeEnoughSizes = sizes.filter { it.width > minWidthProportioned }
return if (largeEnoughSizes.isNotEmpty()) {
    largeEnoughSizes.minBy { it.width }
} else {
    sizes.maxBy { it.width }
} ?: sizes[0]
```
See `Int.getMinWidth()` for the minimum widths for specific barcode formats.
To help Google process the frame, a rotation needs to be passed into `InputImage.fromMediaImage(image, rotation)`. To calculate this rotation we use three pieces of information:
- The device's orientation: which way up the user is holding the phone
- The camera sensor orientation
- The facing direction of the camera
When device manufacturers build their phones they sometimes mount the camera sensors in different orientations (typically at 90 or 270 degrees) to help them manage the minimal space available inside the phone.
See `BarcodeScanner.getRotationCompensation()` and the Google docs for the code and more information.
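The arithmetic itself is small. Here is a sketch of the calculation following Google's documented approach; the function name and parameters are illustrative, not the library's API:

```kotlin
// deviceRotationDegrees: 0/90/180/270, derived from the display rotation.
// sensorOrientationDegrees: from CameraCharacteristics.SENSOR_ORIENTATION.
// Returns the clockwise rotation to apply to the frame before detection.
fun rotationCompensation(
    deviceRotationDegrees: Int,
    sensorOrientationDegrees: Int,
    isFrontFacing: Boolean
): Int = if (isFrontFacing) {
    (sensorOrientationDegrees + deviceRotationDegrees) % 360
} else {
    (sensorOrientationDegrees - deviceRotationDegrees + 360) % 360
}
```

For example, a typical back camera mounted at 90 degrees on a portrait-held phone needs no extra compensation beyond the sensor's own 90 degrees.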
With the Camera2 APIs come many options you can apply to your camera requests, from color correction to lens aperture.
The wide array of devices and manufacturers means that different devices offer different features; not all devices have all of them. Therefore, when adding options to your requests, you should query the characteristics of the camera and select an appropriate setting.
To request the use of a feature, add it to your request via the `set` method on `CaptureRequest.Builder`:

```kotlin
builder.set(CaptureRequest.FEATURE_CONST, value)
```
`CaptureRequest.CONTROL_AF_MODE`
For a barcode scanner we want the camera to continually seek the correct focus, so `CONTROL_AF_MODE_CONTINUOUS_PICTURE` is a good option, if it's available.
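Checking availability before setting the mode might look like this. The integer constants below are the documented values from `android.hardware.camera2.CameraMetadata`, inlined here so the sketch is self-contained; in real code you would use the framework constants and read the array from `CameraCharacteristics.CONTROL_AF_AVAILABLE_MODES`:

```kotlin
// Values from android.hardware.camera2.CameraMetadata (inlined for illustration).
const val CONTROL_AF_MODE_OFF = 0
const val CONTROL_AF_MODE_AUTO = 1
const val CONTROL_AF_MODE_CONTINUOUS_PICTURE = 4

// Prefer continuous focus for scanning, fall back to one-shot auto focus,
// and finally to no auto focus at all.
fun chooseAfMode(availableAfModes: IntArray): Int = when {
    CONTROL_AF_MODE_CONTINUOUS_PICTURE in availableAfModes -> CONTROL_AF_MODE_CONTINUOUS_PICTURE
    CONTROL_AF_MODE_AUTO in availableAfModes -> CONTROL_AF_MODE_AUTO
    else -> CONTROL_AF_MODE_OFF
}
```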
In combination with auto focus, we have also implemented a tap to focus feature. This allows the user to tap on the view and the camera will try to focus where the user has tapped.
This is a very complex feature and requires a few main steps:
- Recognise that the user has tapped on the screen. This is achieved with an `OnTouchListener` on the `SurfaceView`.
- Translate the user's tap into coordinates which are relative to the camera sensor. This involves using the device rotation, camera sensor rotation, camera facing and camera sensor array. See `BarcodeScanner.calculateFocusRegions()`.
- Cancel the ongoing repeating request (used for preview and image analysis): `session.stopRepeating()`.
- Cancel any previous focus requests. This requires making a single capture request passing `CameraMetadata.CONTROL_AF_TRIGGER_CANCEL`.
- Create a focus request. This is a single capture request with some specific focus-related settings, including the coordinates calculated above; primarily `CameraMetadata.CONTROL_AF_TRIGGER_START` and `CaptureRequest.CONTROL_AF_REGIONS`.
- Once the focus request has completed, restart the repeating request, passing the same coordinates.
- After a delay:
  - Cancel the ongoing repeating request, as above.
  - Cancel the focus request, as above.
  - Restart the repeating request without the specific coordinates.
See https://github.com/brightec/KBarcode/pull/43 for more code details.
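The core request sequence (stop repeating, cancel, trigger, restart) might be sketched roughly as below. This is illustrative Android code, not the library's implementation; `previewRequestBuilder`, `focusRegions` and `handler` are assumed to exist, and the delayed reset step is elided (see the PR for the real code):

```kotlin
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CameraMetadata
import android.hardware.camera2.CaptureRequest
import android.hardware.camera2.TotalCaptureResult
import android.hardware.camera2.params.MeteringRectangle
import android.os.Handler

// Sketch only: issue a tap-to-focus sequence on an open capture session.
fun tapToFocus(
    session: CameraCaptureSession,
    previewRequestBuilder: CaptureRequest.Builder,
    focusRegions: Array<MeteringRectangle>,
    handler: Handler
) {
    // Cancel the ongoing repeating request.
    session.stopRepeating()

    // Cancel any previous focus request with a single capture.
    previewRequestBuilder.set(
        CaptureRequest.CONTROL_AF_TRIGGER, CameraMetadata.CONTROL_AF_TRIGGER_CANCEL)
    session.capture(previewRequestBuilder.build(), null, handler)

    // Issue a focus request at the tapped coordinates.
    previewRequestBuilder.set(CaptureRequest.CONTROL_AF_REGIONS, focusRegions)
    previewRequestBuilder.set(
        CaptureRequest.CONTROL_AF_MODE, CameraMetadata.CONTROL_AF_MODE_AUTO)
    previewRequestBuilder.set(
        CaptureRequest.CONTROL_AF_TRIGGER, CameraMetadata.CONTROL_AF_TRIGGER_START)
    session.capture(previewRequestBuilder.build(),
        object : CameraCaptureSession.CaptureCallback() {
            override fun onCaptureCompleted(
                s: CameraCaptureSession, request: CaptureRequest, result: TotalCaptureResult
            ) {
                // Focus finished: restart the repeating request, keeping the regions.
                previewRequestBuilder.set(
                    CaptureRequest.CONTROL_AF_TRIGGER, CameraMetadata.CONTROL_AF_TRIGGER_IDLE)
                s.setRepeatingRequest(previewRequestBuilder.build(), null, handler)
            }
        }, handler)
}
```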
We have tried, wherever possible, to separate concerns. This has resulted in 4 main classes.
BarcodeView
For the majority of users, this is the view you will incorporate into your apps. It is the main touch point with the library. It is responsible for managing the `SurfaceView` which renders the preview, and for coordinating with the `BarcodeScanner`.
BarcodeScanner
This class manages the connection between the `CameraSource` and the `BarcodeImageProcessor`. It also manages the connection with any additional surfaces passed in, for example the `SurfaceView` that the `BarcodeView` passes in.
Some users may wish to interact directly with this class, if they have a special use case.
Camera2Source
This class is responsible for managing the camera. For more information on the camera implementation, see above.
BarcodeImageProcessor
This class is responsible for managing the processing of barcodes. For more information on this, see above.
Please do raise issues for suggesting improvements to this wiki.