Logging Experiment and Model Metrics
ModelBox integrates with metrics storage services to store metrics about training hardware, experiments, and models.
Python SDK
Metrics can be logged against any object in ModelBox: models, experiments, specific model versions, and so on. A MetricValue
is logged against the object's id at a given timestamp.
API
MetricValue
class MetricValue:
    step: int
    wallclock_time: int
    value: Union[float, str, bytes]
The value can be a float representing a scalar, or bytes or a string representing a serialized tensor.
The step is optional; when set, it should be an integer (matching its declared type) giving the logical step of the experiment at the time the metric was logged.
The wallclock time is the physical clock time at which the metric was logged.
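As an illustration of the three value kinds, here is a minimal sketch that redeclares MetricValue as a dataclass stand-in so the snippet runs on its own. The dataclass form, the use of Unix seconds for wallclock_time, and the sample values are assumptions for illustration, not the SDK's actual definition.

```python
import time
from dataclasses import dataclass
from typing import Union


@dataclass
class MetricValue:
    """Stand-in mirroring the documented fields (assumed to be a plain record)."""
    step: int
    wallclock_time: int
    value: Union[float, str, bytes]


now = int(time.time())  # assumption: wallclock_time is expressed in Unix seconds

# A scalar metric: a loss value recorded at logical step 100.
loss = MetricValue(step=100, wallclock_time=now, value=0.25)

# A tensor serialized to a string (the encoding is up to the caller).
weights_text = MetricValue(step=100, wallclock_time=now, value="[0.1, 0.2, 0.3]")

# A tensor serialized to raw bytes (three placeholder bytes here).
weights_raw = MetricValue(step=100, wallclock_time=now, value=b"\x00\x01\x02")
```

How a tensor is serialized into a string or bytes is left to the caller in this sketch; the source does not specify an encoding.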
- SDK API
log_metrics(self, parent_id: str, key: str, value: MetricValue)
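The call pattern for log_metrics can be sketched as follows. Because the real client class and its construction are not shown here, the example uses a hypothetical in-memory client (InMemoryMetricsClient) that only reproduces the documented log_metrics signature. The get_metrics helper, the experiment id, and the timestamps are likewise assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class MetricValue:
    """Stand-in mirroring the documented MetricValue fields."""
    step: int
    wallclock_time: int
    value: Union[float, str, bytes]


class InMemoryMetricsClient:
    """Hypothetical stand-in for the SDK client; keeps metrics in a dict."""

    def __init__(self) -> None:
        # parent_id -> key -> values in the order they were logged
        self._store: Dict[str, Dict[str, List[MetricValue]]] = defaultdict(
            lambda: defaultdict(list)
        )

    def log_metrics(self, parent_id: str, key: str, value: MetricValue) -> None:
        # Same signature as the documented SDK call.
        self._store[parent_id][key].append(value)

    def get_metrics(self, parent_id: str) -> Dict[str, List[MetricValue]]:
        # Hypothetical read helper, added only so the example can show results.
        return {key: list(values) for key, values in self._store[parent_id].items()}


client = InMemoryMetricsClient()
experiment_id = "experiment-123"  # hypothetical object id

# Log one loss value per training step against the same parent object.
for step, loss in enumerate([0.9, 0.6, 0.4]):
    client.log_metrics(
        experiment_id,
        "train_loss",
        MetricValue(step=step, wallclock_time=1_700_000_000 + step, value=loss),
    )

logged = client.get_metrics(experiment_id)
```

Each call logs a single MetricValue under one key, so a training loop typically calls log_metrics once per metric per step.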
gRPC API
// Log Metrics for an experiment, model or checkpoint
rpc LogMetrics(LogMetricsRequest) returns (LogMetricsResponse);
// Get metrics logged for an experiment, model or checkpoint.
rpc GetMetrics(GetMetricsRequest) returns (GetMetricsResponse);
// Metrics contain the metric values for a given key
message Metrics {
  string key = 1;
  repeated MetricsValue values = 2;
}

// Metric Value at a given point of time.
message MetricsValue {
  uint64 step = 1;
  uint64 wallclock_time = 2;
  oneof value {
    float f_val = 5;
    string s_tensor = 6;
    bytes b_tensor = 7;
  }
}

// Message for logging a metric value at a given time
message LogMetricsRequest {
  string parent_id = 1;
  string key = 2;
  MetricsValue value = 3;
}

message LogMetricsResponse {}

message GetMetricsRequest {
  string parent_id = 1;
}

message GetMetricsResponse {
  repeated Metrics metrics = 1;
}
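The oneof in MetricsValue means exactly one of f_val, s_tensor, or b_tensor is set per value. The sketch below shows one way a Python value could be mapped onto those fields. It builds plain dicts shaped like the messages rather than using generated gRPC stubs, and the helper names to_metrics_value and to_log_metrics_request are hypothetical.

```python
from typing import Any, Dict, Union


def to_metrics_value(
    step: int, wallclock_time: int, value: Union[float, str, bytes]
) -> Dict[str, Any]:
    """Build a dict shaped like MetricsValue with exactly one oneof field set."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("bool is not a supported metric value")
    if isinstance(value, (int, float)):
        field, payload = "f_val", float(value)
    elif isinstance(value, str):
        field, payload = "s_tensor", value
    elif isinstance(value, (bytes, bytearray)):
        field, payload = "b_tensor", bytes(value)
    else:
        raise TypeError(f"unsupported metric value type: {type(value).__name__}")
    return {"step": step, "wallclock_time": wallclock_time, field: payload}


def to_log_metrics_request(
    parent_id: str,
    key: str,
    step: int,
    wallclock_time: int,
    value: Union[float, str, bytes],
) -> Dict[str, Any]:
    """Build a dict shaped like LogMetricsRequest."""
    return {
        "parent_id": parent_id,
        "key": key,
        "value": to_metrics_value(step, wallclock_time, value),
    }


# Hypothetical id and timestamp, for illustration only.
request = to_log_metrics_request("experiment-123", "val_accuracy", 10, 1_700_000_000, 0.5)
```

Note that f_val is a protobuf float, which is single precision, so a Python float may lose precision once it is encoded on the wire.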