TimeSeries data
Time series = Object + Tag + Metrics + actual data
Monitored objects fall into three categories:
Machine level: Physical machine, virtual machine, operating system
Instance level: Container, process
Service level (logical object): Service, service group, cluster
Metrics are numeric measurements. Metrics can include:
A numeric status at a moment in time (like CPU % used)
Aggregated measurements (like a count of events over a one-minute window, or a rate of events per minute)
The types of metric aggregation are diverse (for example, average, total, minimum, maximum, sum-of-squares), but all metrics generally share the following traits:
A name
A timestamp
One or more numeric values
Annotated key-value pairs
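The four traits above can be sketched as a small record type. This is only an illustration: the class name `MetricPoint`, the field names, and the sample values are assumptions, not part of any particular monitoring system.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class MetricPoint:
    """One metric sample: a name, a timestamp, numeric values, and tags."""
    name: str                  # metric name, e.g. "http.requests"
    timestamp: int             # Unix time in seconds
    values: Dict[str, float]   # one or more numeric values, keyed by aggregation type
    tags: Dict[str, str] = field(default_factory=dict)  # annotated key-value pairs


# Hypothetical sample: 120 requests counted over a one-minute window.
point = MetricPoint(
    name="http.requests",
    timestamp=1_700_000_000,
    values={"count": 120.0, "rate_per_minute": 120.0},
    tags={"host": "web-01", "region": "us-east"},
)
```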
Sequential read: Read by time range
Random write: Each write targets a different time series
Each object is usually sampled and written every 5 or 10 seconds.
Far more writes than reads
Lots of aggregating dimensions
If the rowkey is designed well, data is distributed evenly across HRegions, and different HRegions can be placed on different server nodes.
Benefits
entity_id and metric_id make data evenly distributed.
timebase keeps data that is contiguous in time next to each other.
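A minimal sketch of such a rowkey, assuming it is the concatenation of entity_id, metric_id, and timebase. The 4-byte field widths, the hourly timebase, and the helper names `make_rowkey` and `split_rowkey` are assumptions made for illustration; the actual layout may differ.

```python
import struct

HOUR = 3600  # assumed width of one timebase bucket, in seconds


def make_rowkey(entity_id: int, metric_id: int, timestamp: int) -> bytes:
    """Pack entity_id, metric_id, and the hour-aligned timebase into 12 bytes."""
    timebase = timestamp - (timestamp % HOUR)
    # Big-endian unsigned ints compare bytewise in the same order as the numbers,
    # so rows of one (entity, metric) pair sort by time.
    return struct.pack(">III", entity_id, metric_id, timebase)


def split_rowkey(rowkey: bytes):
    """Inverse of make_rowkey: returns (entity_id, metric_id, timebase)."""
    return struct.unpack(">III", rowkey)


# Two samples of the same series, one hour apart, land in adjacent rows.
k1 = make_rowkey(7, 42, 1_700_000_100)
k2 = make_rowkey(7, 42, 1_700_003_700)
```

Because the timebase is the last component, a time-range read for one entity and metric becomes a contiguous scan between two rowkeys.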
Support tag aggregation
HBase has no native secondary index. Without an additional index, finding all entity_ids for a given tag would require a full scan.
In the example below:
Given a tag K1=V1, it can find all entities containing the tag: entity_id1, entity_id2, entity_id3
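The lookup from tag to entities can be sketched as an inverted index. The version below is held in memory purely for illustration; the class name `TagIndex` and its methods are hypothetical, and a real deployment would persist this mapping (for example in a separate table).

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Tag = Tuple[str, str]  # (key, value)


class TagIndex:
    """Inverted index: tag -> set of entity ids carrying that tag."""

    def __init__(self) -> None:
        self._entities_by_tag: Dict[Tag, Set[str]] = defaultdict(set)

    def add(self, entity_id: str, tags: Dict[str, str]) -> None:
        """Register every tag of an entity."""
        for key, value in tags.items():
            self._entities_by_tag[(key, value)].add(entity_id)

    def lookup(self, key: str, value: str) -> Set[str]:
        """Return a copy of the entity ids tagged key=value."""
        return set(self._entities_by_tag.get((key, value), set()))


index = TagIndex()
index.add("entity_id1", {"K1": "V1"})
index.add("entity_id2", {"K1": "V1", "K2": "V2"})
index.add("entity_id3", {"K1": "V1"})
index.add("entity_id4", {"K1": "V9"})

matches = index.lookup("K1", "V1")
```

Aggregating over a tag then means resolving the tag to its entity ids first and reading the time series of each entity.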
Vertical sharding
Each product has its own database
Horizontal partitioning
Slice name is Product - data - {starttime}
starttime is the start time of the data in the table.
This makes it easy to remove data in batches.
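A rough sketch of how time-based slice names support batch removal. It assumes one slice per day and a `YYYYMMDD` rendering of the start time; both choices, as well as the function names and the product name `billing`, are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def slice_start(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its (assumed daily) slice."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def slice_name(product: str, ts: datetime) -> str:
    """Build a slice name of the form <product>-data-<starttime>."""
    return f"{product}-data-{slice_start(ts):%Y%m%d}"


def expired_slices(names: List[str], now: datetime, retention: timedelta) -> List[str]:
    """Return the slices whose start time is older than the retention cutoff."""
    cutoff = slice_start(now - retention)
    expired = []
    for name in names:
        start = datetime.strptime(name.rsplit("-", 1)[1], "%Y%m%d")
        if start.replace(tzinfo=timezone.utc) < cutoff:
            expired.append(name)
    return expired


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
names = [slice_name("billing", now - timedelta(days=d)) for d in range(5)]
to_drop = expired_slices(names, now, timedelta(days=3))
```

Dropping a whole slice is a single operation on one table, rather than a delete per row.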
Downsampling
Pre-downsampling
The longer the retention period, the coarser the resolution at which data is stored.
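Pre-downsampling can be sketched as rolling raw points up into fixed-width buckets before they are stored. The tier table and the choice of averaging are assumptions for illustration; real systems often keep several aggregates (min, max, sum, count) per bucket.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (Unix timestamp in seconds, value)

# Hypothetical retention tiers: (resolution in seconds, retention in days).
TIERS = [(10, 7), (60, 30), (3600, 365)]


def rollup(points: List[Point], resolution: int) -> List[Point]:
    """Average raw points into buckets that are `resolution` seconds wide."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % resolution].append(value)
    return [(start, sum(vs) / len(vs)) for start, vs in sorted(buckets.items())]


raw = [(0, 1.0), (10, 3.0), (50, 5.0), (60, 7.0), (110, 9.0)]
per_minute = rollup(raw, 60)
```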
Post-downsampling
At query time, dynamically downsample data based on the user-specified query range.
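One way to sketch post-downsampling is to derive a bucket width from the query range and a cap on the number of returned points, then aggregate at read time. The `max_points` cap, the 10-second base step, and the function names are assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (Unix timestamp in seconds, value)


def choose_step(start: int, end: int, max_points: int, base_step: int = 10) -> int:
    """Pick a multiple of base_step so the range yields at most max_points buckets."""
    raw_step = math.ceil((end - start) / max_points)
    return max(base_step, math.ceil(raw_step / base_step) * base_step)


def query(points: List[Point], start: int, end: int, max_points: int = 100) -> List[Point]:
    """Return per-bucket averages over [start, end), with the step chosen at query time."""
    step = choose_step(start, end, max_points)
    sums: Dict[int, Tuple[float, int]] = {}
    for ts, value in points:
        if start <= ts < end:
            bucket = start + (ts - start) // step * step
            total, count = sums.get(bucket, (0.0, 0))
            sums[bucket] = (total + value, count + 1)
    return [(b, total / count) for b, (total, count) in sorted(sums.items())]


points = [(t, float(t)) for t in range(0, 120, 10)]
result = query(points, 0, 120, max_points=3)
```

A wider query range produces a larger step, so the number of points returned stays bounded regardless of how much raw data the range covers.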