Kamua: cloud-based AI editor that helps video storytellers work faster

High-speed, browser-accessible tools with smart automation for intelligently reframing video.
Comparison of Google AI’s AutoFlip to Kamua’s AutoCrop and a static Center Crop with no AI tracking.

One of the biggest challenges that video storytellers face in 2020 is taking video filmed for desktop and converting or “reframing” it for mobile portrait apps such as TikTok, Snapchat and Instagram. Spending countless hours dragging boxes around in buggy desktop software packages and waiting for projects to render is the usual grind that creators tell us about. That’s why we built Kamua.

Kamua is designed from the ground up to run 100% in the cloud on the fastest GPUs available, using no local computing resources. The result is a dramatic reduction in manual work: some of our users have cut workflows from 16 hours to 15 minutes by embracing the automation we have built over the past three years.

“Kamua has transformed our team’s ability to quickly produce social videos for our clients based on their broadcast TV ads. We cut our end-to-end production time by over 90% and we now produce multiple variants to A/B test, something we did not have the time to do before.”

– Kamua trial user
Kamua’s AI detects and tracks objects and uses machine learning to simulate how a videographer would have filmed the shot. This saves video storytellers critical time and frustration, and frees them and their hardware to create more.

Kamua offers a wide range of features focused on saving time and ease of use. Cloud-based computing renders video outputs in minutes instead of hours, and gone are the 200-hour learning curves that traditional editing software often demands before users can master the techniques professional social video requires. This approach empowers content marketers to produce professional-grade results themselves and frees video editors to focus on the nuances of storytelling.

How Does Kamua Compare to Google AI’s AutoFlip?

Last week, Google AI released something it had been working on for some time: a machine-learning-based video reframing technology called AutoFlip. When we saw that blog post, we were pretty shocked: Google had been working on almost exactly the same approach to reframing and shot detection that we had developed, and had decided to open-source its framework.

“As an early-stage startup, you would never assume your AI video tech is better than Google’s stealth tech; all you really know is how hard it has been to build your tech.”

We installed AutoFlip and ran it on a set of videos, multiple times. It’s not easy to write a blog post about how your own startup outperforms Google without seeming like braggarts, but Kamua is not just a bit better than AutoFlip: it’s much better, possibly an order of magnitude better and faster. And it’s already deployed and accessible in a browser. As an early-stage startup, you would never assume your AI video tech is better than Google’s stealth tech; all you really know is how hard it has been to build your tech and, now, you also know that it’s market-ready.

We detect and track objects in video more accurately; we use multiple neural networks that collaborate to determine the likely focal points of a video, rather than relying on object detection alone; and we produce smoother camera paths, to name a few advantages.
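To give a rough idea of what camera smoothing means here (this is a minimal sketch of the general technique, not Kamua’s actual proprietary filter): per-frame focal-point detections jitter from frame to frame, and a smoothing pass over them yields the steadier pan a human camera operator would produce. Even a simple moving average illustrates the effect:

```python
def smooth_path(centers, window=5):
    """Moving-average smoothing of per-frame focal-point x coordinates.

    Raw detections jitter frame to frame; averaging each point with its
    neighbours produces a steadier virtual-camera pan.
    """
    half = window // 2
    smoothed = []
    for i in range(len(centers)):
        lo = max(0, i - half)
        hi = min(len(centers), i + half + 1)
        span = centers[lo:hi]
        smoothed.append(sum(span) / len(span))
    return smoothed

# Jittery detections around x=100, with one outlier at frame 3.
raw = [98, 102, 100, 140, 99, 101, 100]
print(smooth_path(raw))
```

Production systems typically use more sophisticated filters (and constrain acceleration as well as position), but the principle is the same: the virtual camera should move the way a human operator would, not the way raw detections do.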

We are sure that by open-sourcing the code, Google will see improvements to AutoFlip, but it is not open-sourcing some critical technologies that live in its (paid) Video Intelligence API. That makes it more likely that Google is using AutoFlip to generate use cases and data for MediaPipe rather than making AutoFlip part of a cloud product suite, but we don’t know and time will tell. What we can be 100% sure of is that Google recognises the huge demand for this kind of technology and has worked very hard on it; if Google is saying it’s important, we’re probably ready to launch at exactly the right time.

So, this week we decided that it’s time to let everyone know that they can access machine-learning-based video editing technology right now in a browser.
Kamua isn’t just a bit better than Google AutoFlip; it’s miles ahead.

How Kamua Works

We built and trained several different neural networks to take over the most monotonous, lowest-value tasks that video editors begrudgingly carry out every day:

  1. Finding Cut Points in videos
  2. Cropping videos from Landscape format to Portrait and Square
  3. Re-arranging the order of scenes or shots in a video
Kamua’s proprietary AutoCut technology cuts videos into their component shots with no user intervention or threshold setting.
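To make the first of these tasks concrete, here is a deliberately simplified sketch of the classic histogram-difference approach to cut-point detection. This is not Kamua’s AutoCut algorithm (which is proprietary); it only illustrates the underlying idea that a hard cut shows up as a spike in the frame-to-frame difference of intensity histograms.

```python
def histogram(frame, bins=16):
    """Bucket 8-bit pixel intensities into a normalized histogram."""
    counts = [0] * bins
    for px in frame:
        counts[px * bins // 256] += 1
    total = len(frame)
    return [c / total for c in counts]

def find_cuts(frames, threshold=0.5):
    """Return frame indices where the histogram distance to the previous
    frame exceeds the threshold, i.e. likely hard cuts."""
    cuts = []
    prev = histogram(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        cur = histogram(frame)
        # Half the L1 distance gives a score in [0, 1].
        dist = sum(abs(a - b) for a, b in zip(prev, cur)) / 2
        if dist > threshold:
            cuts.append(i)
        prev = cur
    return cuts

# Two synthetic "shots": five dark frames, then five bright frames.
dark = [20] * 1000
bright = [230] * 1000
frames = [dark] * 5 + [bright] * 5
print(find_cuts(frames))  # -> [5]: the cut lands at frame index 5
```

Threshold-based detectors like this one are exactly what AutoCut avoids making the user tune: gradual transitions, flashes and fast motion all defeat a single fixed threshold, which is why learned shot detection is valuable.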

The user has the option of fully automating each process or manually intervening to override the decisions of the AI on a shot-by-shot basis. An example of this is the ability to instruct the AI to choose different objects, people or elements to focus on for the crop.
Kamua’s CustomTrack technology allows the user to click and drag a selection box around the object or area on which they wish to focus the crop.
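The geometry behind any such crop, whether chosen by the AI or by a CustomTrack selection, comes down to positioning a fixed-aspect window over the tracked focal point and clamping it to the frame. A minimal sketch (illustrative only; the function name and defaults are our assumptions, not Kamua’s API):

```python
def portrait_crop(frame_w, frame_h, focus_cx, target_aspect=9 / 16):
    """Compute the x-offset and width of a portrait crop window centered
    on a tracked focal point, clamped so the crop stays inside the frame.

    The crop keeps the full frame height; only the width is reduced.
    """
    crop_w = round(frame_h * target_aspect)
    x = round(focus_cx - crop_w / 2)
    x = max(0, min(x, frame_w - crop_w))  # clamp to frame bounds
    return x, crop_w

# 1920x1080 landscape source, subject tracked near the right edge:
# the window slides right but stops at the frame boundary.
print(portrait_crop(1920, 1080, 1800))  # -> (1312, 608)
```

Running this per frame over the smoothed focal-point path produces the virtual pan; the clamping step is what keeps the crop from sliding off the edge when a subject exits the frame.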

Bleeding Edge Browser Tech

All a user needs is Google Chrome or Firefox (full Safari support is coming soon) on a laptop and an internet connection. Videos can be uploaded directly or linked from a hosting source (such as YouTube or Vimeo), provided the user has the rights to use them. Learning Kamua’s core features takes 20 minutes or less, whether a user has previous editing experience or none at all. Kamua pushes the limits of in-browser video seeking, playback and audio file management, and we are excited to see how powerful browsers are becoming. We also plan to release mobile apps in the coming months.

Choosing a different aspect ratio is as simple as clicking a drop-down menu or setting a custom value. Rendering currently takes 0.5 seconds per second of source footage (so a two-minute clip renders in about a minute), and the output is delivered at the original resolution.

Future & Experimental Features
Kamua will soon offer users experimental features such as Contextual Object Removal.

Deep Learning gives us incredible opportunities to offer game-changing technology to video marketers. In this rather impractical demonstration, we show how this technology can perform complex tasks such as removing the bicycle, the rider and (most of the) shadow from the scene. As the machine learning improves, it will be able to more accurately re-paint the areas that were previously obscured by the removed object.

This technology does become practical for the removal of burned-in subtitles, graphics and other items that do not fit within a cropped video on a mobile screen, or that need to be removed or replaced.

We are also working on a variety of additional features, such as automated upscaling (480p to 4K), colorization, and color matching. Subscribe to our blog here to stay up to date.

We will be publishing more posts in the coming weeks and months that go into our tech and product in greater detail, and our YouTube channel already has some informative tutorials and demos on it. We hope you will subscribe and look forward to your feedback in the comments.

Full disclosure: Google nominated Kamua to its Google Cloud for Startups program, allowing us to operate class-leading GPU servers and deliver Kamua’s AI to our Beta testers since August 2019.

About Us: Kamua is a technology startup headquartered in London with engineering operations in Bucharest. We are hiring. Follow us on LinkedIn and check out our job postings there.
