Xeoma can actually run fine in headless mode. They have a server mode and client mode.
Right now I've got a system running Xeoma on an Arch server with over 70 cameras.
As for hardware to hopefully give you a reference point, the server has:
16-core AMD Epyc (I can't remember the exact part number now, but it's 2.8GHz with 32 threads)
64GB of RAM
and a 140 TB RAID array.
The client is just a random, somewhat old Dell workstation.
As of right now, I don't have any motion detection going on. Everything records 24/7, and last time I checked, the CPU was at about 50% load. I can't remember the RAM usage, but I've got plenty of RAM left.
The 140TB array can easily handle 40 days of footage, and every single camera is running at 1920x1080 or higher, with varying frame rates. Many of the cameras are running at higher resolutions, and 3 of them are actually 4K.
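If you want to sanity-check those storage numbers against your own build, here's a rough back-of-the-envelope sketch in Python. It assumes exactly 70 cameras (I have a few more than that), decimal terabytes, and that the whole array holds nothing but footage, with no filesystem or RAID overhead counted. The function name is made up for illustration, so treat the result as a ballpark, not a measurement.

```python
def average_bitrate_mbps(capacity_tb, retention_days, camera_count):
    """Average per-camera bitrate (Mbit/s) that fills the array in the given time.

    Assumes decimal terabytes (1 TB = 1e12 bytes) and that every byte of
    capacity is used for footage. Both are simplifying assumptions.
    """
    total_bits = capacity_tb * 1e12 * 8          # array capacity in bits
    seconds = retention_days * 24 * 60 * 60      # retention window in seconds
    return total_bits / seconds / camera_count / 1e6


# Numbers from my setup: 140 TB, 40 days, roughly 70 cameras (assumed exact here)
per_camera = average_bitrate_mbps(140, 40, 70)
print(f"{per_camera:.2f} Mbit/s per camera")
```

With those assumptions it works out to a bit over 4.6 Mbit/s per camera on average. If your cameras stream at a higher bitrate than that, expect fewer days of retention from the same array.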
As for usability of Xeoma, I'd honestly recommend you find something else, especially if you're designing this project for a client or anyone other than yourself. I've noticed that with the number of cameras I'm running, the system is a little buggy. Sometimes the live preview of cameras just shows a green screen, but if you view the camera in fullscreen mode, it works fine. Connecting with VLC directly is fine too. Recording isn't affected either, but when your client walks in and sees a bunch of green previews, they think the whole system is down.
The interface is a bit clunky at times, and some things are missing that you would think should just be there (especially for the price of the software). For example, in the playback for recorded footage, there's no rewind button and no button to skip back 15 or 30 seconds, which makes viewing footage (and quickly showing it to law enforcement) a hassle. The other thing is: let's say you know something happened at 2pm on a Tuesday. So you go to that camera and select 2pm on Tuesday. Now, if you need to view the same event at the same time from another camera, it won't remember what time you were on, so you have to go through the process of selecting the time all over again. When you have to show the officers someone walking through the view of 12 different cameras, that gets old really fast.
There are a few other issues that don't come to mind immediately, but based on my experience with it, I don't think I could recommend Xeoma to people. Now, this is just my experience. Maybe Xeoma really can't handle large numbers of cameras, or maybe there's some other random thing going on.
Right now Xeoma is at least recording things (mostly, I had a weird issue the other day), but I'm going to set up some VMs on my tower to test out Kerberos, Bluecherry, and Motion.