Description
Google MediaPipe's face and hand landmark detection is essentially a wrapper around OpenCV, which slows down the API. Using the API properly requires extra work on the caller's side, because there are essentially no helper functions for robust HPR (heading, pitch, roll) and XYZ extraction from the landmarks. The landmark API is also inconsistent, because MediaPipe started as a 'hack' of OpenCV rather than as an algorithm built from the ground up, although the attempt is on target.

Acceleration could be introduced through OpenCV's platform-agnostic GPU support, accelerated matrices, and polling. Training on the back, left, right, top, bottom, and front of the face could improve detection, since currently only the front upper half (a quarter-sphere segment) is trained; nevertheless, introducing horizontal and quaternion pitch can compensate for the front lower half. Hand detection could be improved by training closed-fist and open-hand landmarks from different angles. Using OpenCV directly without MediaPipe, or using different algorithms, might be a more stable alternative.
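To illustrate the kind of HPR/XYZ helper that is missing, here is a minimal sketch in plain Python. It is not part of MediaPipe or OpenCV: the function name `head_pose` and its three inputs (two eye points and a chin point) are hypothetical, and it assumes the points are already in one consistent scale with x to the right, y down, and z increasing away from the camera. The angles follow a yaw-pitch-roll (Y, X, Z) decomposition, and their signs depend on that assumed convention.

```python
import math


def _sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(v):
    length = math.sqrt(_dot(v, v))
    if length == 0.0:
        raise ValueError("degenerate landmarks: zero-length axis")
    return tuple(c / length for c in v)


def head_pose(eye_a, eye_b, chin):
    """Estimate ((heading, pitch, roll) in degrees, xyz) from three 3D points.

    eye_a / eye_b: the image-left and image-right eye points (hypothetical inputs).
    chin: a point below the eyes. All points must share one scale.
    """
    # Face x axis runs from one eye to the other.
    x_axis = _unit(_sub(eye_b, eye_a))
    # The eye midpoint doubles as the reported XYZ position.
    mid = tuple((p + q) / 2.0 for p, q in zip(eye_a, eye_b))
    # Face y axis points toward the chin, made orthogonal to x (Gram-Schmidt).
    down = _sub(chin, mid)
    proj = _dot(down, x_axis)
    y_axis = _unit(tuple(d - proj * x for d, x in zip(down, x_axis)))
    # Face z axis completes the right-handed frame.
    z_axis = _cross(x_axis, y_axis)

    # Decompose R = Ry(heading) * Rx(pitch) * Rz(roll), columns = face axes.
    heading = math.degrees(math.atan2(z_axis[0], z_axis[2]))
    pitch = math.degrees(math.asin(max(-1.0, min(1.0, -z_axis[1]))))
    roll = math.degrees(math.atan2(x_axis[1], y_axis[1]))
    return (heading, pitch, roll), mid


if __name__ == "__main__":
    # A frontal face: eyes level, chin straight below, all at the same depth.
    hpr, xyz = head_pose((0.4, 0.4, 0.0), (0.6, 0.4, 0.0), (0.5, 0.7, 0.0))
    print("HPR:", hpr, "XYZ:", xyz)
```

With real landmarks, the normalized x and y values would first need to be brought to a common scale (for example by multiplying by the image width and height); which landmark indices to feed in is left open here.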