Create a template for all dimensions (netdata#6560)
* health_connection: Comments inside Health Config

To better understand what needs to change in health, and where,
I commented the functions inside this file.

* health_connection: Comments about Health in other files

This commit brings the rest of the comments that were missing for health

* health_connection: Comments on health_log

I had to append more comments on health_log

* health_connection: Create a new variable

New variable is created to work with foreach

* health_connection: Fix new option and doc

The first implementation of 'foreach' had a problem; this commit fixes it.
This commit also brings the updates for the documentation

* health_connection: Understanding health

This commit saves the place where I am working; it has the map to understand the whole alarm process

* health_connection: Update map

I changed the position of the error message to identify the correct place to add new alarms

* health_connection: End of simple alarm

This commit finishes what is necessary to apply the same lookup to different dimensions in a single line

* health_connection: Documentation and template steps

This commit brings the documentation that was missing for templates, plus comments to help with the next
step of applying a template to create an alarm.

* health_connection: Restoring

After some tests, it was detected that the alarms were not working as expected

* health_connection: Fix bug and bring dimension to template

This commit fixes an old Netdata bug: previously, Netdata always tried to create
a new entry in an index with the same id, raising an error.
It also makes it possible to use 'foreach' in a template

* health_connection: Fix cmake compilation

There was a problem with cmake compilation fixed by this commit

* health_connection: shell script

Finalize the shell script to test the PR

* health_connection: Remove debug message

During development, I used some messages to understand the code;
this commit removes the last message

* health_connection: Fix bugs

This commit fixes bugs reported by tests

* health_connection: Alarm working

This commit brings the change necessary for the alarms to work, but it is missing the unlink from the newest list

* health_connection: Template code written

This commit finishes the creation of alarm from template, but it was not tested yet.

* health_connection: Remove comments

I am removing the comments from this PR to bring them back later

* health_connection: Remove lines

Another commit to restore the files to their state before they were commented

* health_connection: New alarm and remove messages

I am bringing a new alarm to test template with SP and removing comments used during the development

* health_connection: Functional test review

After reviewing the functional test script, a small adjustment was necessary to
test all the features available with the new version

* health_connection: Free structure

I am moving the freeing of the list to the correct place; the previous place was not safe

* health_connection: ShellCheck

This commit fixes the problems with shellcheck

* health_connection: Fix hash

This commit fixes the hash calculation, which was using the wrong input

* health_connection: Fix message error

The system was showing a wrong message because, when we have foreach,
the alarm created from a template is added to the index in a second stage

* health_connection: Fix documentation

In this commit I am fixing the grammar of the previous doc and bringing
two examples

* health_connection: Fix examples

This commit fixes the last two examples that were brought in this PR

* health_connection: Fix example doc

When I brought the correct grammar in the last commit, I lost a mark

* health_connection: Grammar fix

Fixing grammar of the documentation

* health_connection: Memory leak

This commit fixes the memory leak that was present in the PR

* health_connection: Reload

This commit fixes the problem that the alarms were not linked after
receiving a SIGUSR2

* health_connection: False Positive from codacy

Codacy was giving a false positive; I changed the function to avoid it.

* health_connection: dead code

Remove dead code from the code.

* health_connection: Memory Leak

Remove memory leak when cleaning the simple pattern

* health_connection: Script format

With this commit I am formatting the last message to return
to the default color on the terminal

* health_connection: Script format 2

With this commit I am formatting the last message to return
to the default color on the terminal

* health_connection: Script format 3

With this commit I am formatting the error message to return
to the default color on the terminal
thiagoftsm authored and cakrit committed Sep 27, 2019
1 parent a8b28bf commit e3471fa
Showing 20 changed files with 636 additions and 91 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -170,6 +170,7 @@ sitespeed-result/
tests/acls/acl.sh
tests/urls/request.sh
tests/alarm_repetition/alarm.sh
tests/template_dimension/template_dim.sh

# tests and temp files
python.d/python-modules-installer.sh
3 changes: 3 additions & 0 deletions database/rrd.h
@@ -697,6 +697,7 @@ struct rrdhost {
// RRDCALCs may be linked to charts at any point
// (charts may or may not exist when these are loaded)
RRDCALC *alarms;
RRDCALC *alarms_with_foreach;
avl_tree_lock alarms_idx_health_log;
avl_tree_lock alarms_idx_name;

@@ -709,6 +710,7 @@ struct rrdhost {
// these are used to create alarms when charts
// are created or renamed, that match them
RRDCALCTEMPLATE *templates;
RRDCALCTEMPLATE *alarms_template_with_foreach;


// ------------------------------------------------------------------------
@@ -1008,6 +1010,7 @@ static inline time_t rrdset_slot2time(RRDSET *st, size_t slot) {
// ----------------------------------------------------------------------------
// RRD DIMENSION functions

extern void rrdcalc_link_to_rrddim(RRDDIM *rd, RRDSET *st, RRDHOST *host);
extern RRDDIM *rrddim_add_custom(RRDSET *st, const char *id, const char *name, collected_number multiplier, collected_number divisor, RRD_ALGORITHM algorithm, RRD_MEMORY_MODE memory_mode);
#define rrddim_add(st, id, name, multiplier, divisor, algorithm) rrddim_add_custom(st, id, name, multiplier, divisor, algorithm, (st)->rrd_memory_mode)
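
The two new list heads added to `struct rrdhost` above keep 'foreach' alarms apart from ordinary ones. A minimal sketch of that idea follows, assuming reduced stand-in types: `struct alarm`, `struct host`, `alarm_list_append` and `host_add_alarm` are hypothetical names for illustration, not Netdata's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for RRDCALC and RRDHOST. */
struct alarm {
    const char *name;
    const char *foreachdim;   /* non-NULL when the alarm uses 'foreach' */
    struct alarm *next;
};

struct host {
    struct alarm *alarms;               /* ordinary alarms */
    struct alarm *alarms_with_foreach;  /* alarms that carry a 'foreach' expression */
};

/* Append rc to the tail of the singly linked list whose head is *head. */
void alarm_list_append(struct alarm **head, struct alarm *rc) {
    assert(head && rc);
    rc->next = NULL;
    while (*head)
        head = &(*head)->next;
    *head = rc;
}

/* Choose the list by whether the alarm has a 'foreach' expression. */
void host_add_alarm(struct host *h, struct alarm *rc) {
    assert(h && rc);
    alarm_list_append(rc->foreachdim ? &h->alarms_with_foreach : &h->alarms, rc);
}
```

Walking a pointer-to-pointer removes the separate empty-list branch; the diff itself uses an explicit `if/else` for the same effect.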

191 changes: 169 additions & 22 deletions database/rrdcalc.c
@@ -255,6 +255,53 @@ inline uint32_t rrdcalc_get_unique_id(RRDHOST *host, const char *chart, const ch
return host->health_log.next_alarm_id++;
}

/**
* Alarm name with dimension
*
* Change the name of the current alarm appending a new diagram.
*
* @param name the alarm name
* @param namelen is the length of the previous vector.
* @param dim the dimension of the chart.
* @param dimlen is the length of the previous vector.
*
* @return It returns the new name on success and the old otherwise
*/
char *alarm_name_with_dim(char *name, size_t namelen, const char *dim, size_t dimlen) {
char *newname,*move;

newname = malloc(namelen + dimlen + 2);
if(newname) {
move = newname;
memcpy(move, name, namelen);
move += namelen;

*move++ = '_';
memcpy(move, dim, dimlen);
move += dimlen;
*move = '\0';
} else {
newname = name;
}

return newname;
}

/**
* Remove pipe comma
*
* Remove the pipes and commas converting to space.
*
* @param str the string to change.
*/
void dimension_remove_pipe_comma(char *str) {
while(*str) {
if(*str == '|' || *str == ',') *str = ' ';

str++;
}
}

inline void rrdcalc_add_to_host(RRDHOST *host, RRDCALC *rc) {
rrdhost_check_rdlock(host);

@@ -282,24 +329,39 @@ inline void rrdcalc_add_to_host(RRDHOST *host, RRDCALC *rc) {
rc->critical->rrdcalc = rc;
}

// link it to the host
if(likely(host->alarms)) {
// append it
RRDCALC *t;
for(t = host->alarms; t && t->next ; t = t->next) ;
t->next = rc;
}
else {
host->alarms = rc;
}
if(!rc->foreachdim) {
// link it to the host alarms list
if(likely(host->alarms)) {
// append it
RRDCALC *t;
for(t = host->alarms; t && t->next ; t = t->next) ;
t->next = rc;
}
else {
host->alarms = rc;
}

// link it to its chart
RRDSET *st;
rrdset_foreach_read(st, host) {
if(rrdcalc_is_matching_this_rrdset(rc, st)) {
rrdsetcalc_link(st, rc);
break;
// link it to its chart
RRDSET *st;
rrdset_foreach_read(st, host) {
if(rrdcalc_is_matching_this_rrdset(rc, st)) {
rrdsetcalc_link(st, rc);
break;
}
}
} else {
//link it case there is a foreach
if(likely(host->alarms_with_foreach)) {
// append it
RRDCALC *t;
for(t = host->alarms_with_foreach; t && t->next ; t = t->next) ;
t->next = rc;
}
else {
host->alarms_with_foreach = rc;
}

//I am not linking this alarm direct to the host here, this will be done when the children is created
}
}

@@ -311,13 +373,19 @@ inline RRDCALC *rrdcalc_create_from_template(RRDHOST *host, RRDCALCTEMPLATE *rt,

RRDCALC *rc = callocz(1, sizeof(RRDCALC));
rc->next_event_id = 1;
rc->id = rrdcalc_get_unique_id(host, chart, rt->name, &rc->next_event_id);
rc->name = strdupz(rt->name);
rc->hash = simple_hash(rc->name);
rc->chart = strdupz(chart);
rc->hash_chart = simple_hash(rc->chart);

rc->id = rrdcalc_get_unique_id(host, rc->chart, rc->name, &rc->next_event_id);

if(rt->dimensions) rc->dimensions = strdupz(rt->dimensions);
if(rt->foreachdim) {
rc->foreachdim = strdupz(rt->foreachdim);
rc->spdim = health_pattern_from_foreach(rc->foreachdim);
}
rc->foreachcounter = rt->foreachcounter;

rc->green = rt->green;
rc->red = rt->red;
@@ -361,7 +429,7 @@ inline RRDCALC *rrdcalc_create_from_template(RRDHOST *host, RRDCALCTEMPLATE *rt,
error("Health alarm '%s.%s': failed to re-parse critical expression '%s'", chart, rt->name, rt->critical->source);
}

debug(D_HEALTH, "Health runtime added alarm '%s.%s': exec '%s', recipient '%s', green " CALCULATED_NUMBER_FORMAT_AUTO ", red " CALCULATED_NUMBER_FORMAT_AUTO ", lookup: group %d, after %d, before %d, options %u, dimensions '%s', update every %d, calculation '%s', warning '%s', critical '%s', source '%s', delay up %d, delay down %d, delay max %d, delay_multiplier %f, warn_repeat_every %u, crit_repeat_every %u",
debug(D_HEALTH, "Health runtime added alarm '%s.%s': exec '%s', recipient '%s', green " CALCULATED_NUMBER_FORMAT_AUTO ", red " CALCULATED_NUMBER_FORMAT_AUTO ", lookup: group %d, after %d, before %d, options %u, dimensions '%s', for each dimension '%s', update every %d, calculation '%s', warning '%s', critical '%s', source '%s', delay up %d, delay down %d, delay max %d, delay_multiplier %f, warn_repeat_every %u, crit_repeat_every %u",
(rc->chart)?rc->chart:"NOCHART",
rc->name,
(rc->exec)?rc->exec:"DEFAULT",
@@ -373,6 +441,7 @@ inline RRDCALC *rrdcalc_create_from_template(RRDHOST *host, RRDCALCTEMPLATE *rt,
rc->before,
rc->options,
(rc->dimensions)?rc->dimensions:"NONE",
(rc->foreachdim)?rc->foreachdim:"NONE",
rc->update_every,
(rc->calculation)?rc->calculation->parsed_as:"NONE",
(rc->warning)?rc->warning->parsed_as:"NONE",
@@ -387,18 +456,94 @@ inline RRDCALC *rrdcalc_create_from_template(RRDHOST *host, RRDCALCTEMPLATE *rt,
);

rrdcalc_add_to_host(host, rc);
RRDCALC *rdcmp = (RRDCALC *) avl_insert_lock(&(host)->alarms_idx_health_log,(avl *)rc);
if (rdcmp != rc) {
error("Cannot insert the alarm index ID %s",rc->name);
if(!rt->foreachdim) {
RRDCALC *rdcmp = (RRDCALC *) avl_insert_lock(&(host)->alarms_idx_health_log,(avl *)rc);
if (rdcmp != rc) {
error("Cannot insert the alarm index ID %s",rc->name);
}
}

return rc;
}

/**
* Create from RRDCALC
*
* Create a new alarm using another alarm as template.
*
* @param rc is the alarm that will be used as source
* @param host is the host structure.
* @param name is the newest chart name.
* @param dimension is the current dimension
* @param foreachdim the whole list of dimension
*
* @return it returns the new alarm changed.
*/
inline RRDCALC *rrdcalc_create_from_rrdcalc(RRDCALC *rc, RRDHOST *host, const char *name, const char *dimension) {
RRDCALC *newrc = callocz(1, sizeof(RRDCALC));

newrc->next_event_id = 1;
newrc->id = rrdcalc_get_unique_id(host, rc->chart, name, &rc->next_event_id);
newrc->name = (char *)name;
newrc->hash = simple_hash(newrc->name);
newrc->chart = strdupz(rc->chart);
newrc->hash_chart = simple_hash(rc->chart);

newrc->dimensions = strdupz(dimension);
newrc->foreachdim = NULL;
rc->foreachcounter++;
newrc->foreachcounter = rc->foreachcounter;

newrc->green = rc->green;
newrc->red = rc->red;
newrc->value = NAN;
newrc->old_value = NAN;

newrc->delay_up_duration = rc->delay_up_duration;
newrc->delay_down_duration = rc->delay_down_duration;
newrc->delay_max_duration = rc->delay_max_duration;
newrc->delay_multiplier = rc->delay_multiplier;

newrc->last_repeat = 0;
newrc->warn_repeat_every = rc->warn_repeat_every;
newrc->crit_repeat_every = rc->crit_repeat_every;

newrc->group = rc->group;
newrc->after = rc->after;
newrc->before = rc->before;
newrc->update_every = rc->update_every;
newrc->options = rc->options;

if(rc->exec) newrc->exec = strdupz(rc->exec);
if(rc->recipient) newrc->recipient = strdupz(rc->recipient);
if(rc->source) newrc->source = strdupz(rc->source);
if(rc->units) newrc->units = strdupz(rc->units);
if(rc->info) newrc->info = strdupz(rc->info);

if(rc->calculation) {
newrc->calculation = expression_parse(rc->calculation->source, NULL, NULL);
if(!newrc->calculation)
error("Health alarm '%s.%s': failed to parse calculation expression '%s'", rc->chart, rc->name, rc->calculation->source);
}

if(rc->warning) {
newrc->warning = expression_parse(rc->warning->source, NULL, NULL);
if(!newrc->warning)
error("Health alarm '%s.%s': failed to re-parse warning expression '%s'", rc->chart, rc->name, rc->warning->source);
}

if(rc->critical) {
newrc->critical = expression_parse(rc->critical->source, NULL, NULL);
if(!newrc->critical)
error("Health alarm '%s.%s': failed to re-parse critical expression '%s'", rc->chart, rc->name, rc->critical->source);
}

return newrc;
}

void rrdcalc_free(RRDCALC *rc) {
if(unlikely(!rc)) return;


expression_free(rc->calculation);
expression_free(rc->warning);
expression_free(rc->critical);
@@ -407,11 +552,13 @@ void rrdcalc_free(RRDCALC *rc) {
freez(rc->chart);
freez(rc->family);
freez(rc->dimensions);
freez(rc->foreachdim);
freez(rc->exec);
freez(rc->recipient);
freez(rc->source);
freez(rc->units);
freez(rc->info);
simple_pattern_free(rc->spdim);
freez(rc);
}
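
The two string helpers added in this file (`alarm_name_with_dim` and `dimension_remove_pipe_comma`) can be tried outside Netdata. Below is a standalone sketch under stated assumptions: the `_sketch` names are hypothetical, and unlike the function in the diff (which returns the original name when `malloc` fails) this version returns NULL on allocation failure so the caller always knows whether it owns the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build "<name>_<dim>" in a freshly allocated buffer.
 * Returns NULL on allocation failure (an assumption of this sketch;
 * the diff's version falls back to returning the original name). */
char *name_with_dim_sketch(const char *name, const char *dim) {
    assert(name && dim);
    size_t namelen = strlen(name);
    size_t dimlen = strlen(dim);

    char *out = malloc(namelen + dimlen + 2);  /* room for '_' and '\0' */
    if (!out)
        return NULL;

    memcpy(out, name, namelen);
    out[namelen] = '_';
    memcpy(out + namelen + 1, dim, dimlen);
    out[namelen + 1 + dimlen] = '\0';
    return out;
}

/* Replace every '|' and ',' in str with a space, in place. */
void separators_to_spaces_sketch(char *str) {
    assert(str);
    for (; *str; str++) {
        if (*str == '|' || *str == ',')
            *str = ' ';
    }
}
```

The caller frees the string returned by `name_with_dim_sketch` with `free()`.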

11 changes: 9 additions & 2 deletions database/rrdcalc.h
@@ -37,7 +37,7 @@ struct rrdcalc {
uint32_t next_event_id; // the next event id that will be used for this alarm

char *name; // the name of this alarm
uint32_t hash;
uint32_t hash; // the hash of the alarm name

char *exec; // the command to execute when this alarm switches state
char *recipient; // the recipient of the alarm (the first parameter to exec)
@@ -59,7 +59,11 @@ struct rrdcalc {
// database lookup settings

char *dimensions; // the chart dimensions
RRDR_GROUPING group; // grouping method: average, max, etc.
char *foreachdim; // the group of dimensions that the `foreach` will be applied.
SIMPLE_PATTERN *spdim; // used if and only if there is a simple pattern for the chart.
int foreachcounter; // the number of alarms created with foreachdim, this also works as an id of the
// children
RRDR_GROUPING group; // grouping method: average, max, etc.
int before; // ending point in time-series
int after; // starting point in time-series
uint32_t options; // calculation options
@@ -148,7 +152,10 @@ extern void rrdcalc_unlink_and_free(RRDHOST *host, RRDCALC *rc);
extern int rrdcalc_exists(RRDHOST *host, const char *chart, const char *name, uint32_t hash_chart, uint32_t hash_name);
extern uint32_t rrdcalc_get_unique_id(RRDHOST *host, const char *chart, const char *name, uint32_t *next_event_id);
extern RRDCALC *rrdcalc_create_from_template(RRDHOST *host, RRDCALCTEMPLATE *rt, const char *chart);
extern RRDCALC *rrdcalc_create_from_rrdcalc(RRDCALC *rc, RRDHOST *host, const char *name, const char *dimension);
extern void rrdcalc_add_to_host(RRDHOST *host, RRDCALC *rc);
extern void dimension_remove_pipe_comma(char *str);
extern char *alarm_name_with_dim(char *name, size_t namelen, const char *dim, size_t dimlen);

static inline int rrdcalc_isrepeating(RRDCALC *rc) {
if (unlikely(rc->warn_repeat_every > 0 || rc->crit_repeat_every > 0)) {
36 changes: 24 additions & 12 deletions database/rrdcalctemplate.c
@@ -5,23 +5,35 @@

// ----------------------------------------------------------------------------
// RRDCALCTEMPLATE management
/**
* RRDCALC TEMPLATE LINK MATCHING
*
* @param rt is the template used to create the chart.
* @param st is the chart where the alarm will be attached.
*/
void rrdcalctemplate_link_matching_test(RRDCALCTEMPLATE *rt, RRDSET *st, RRDHOST *host ) {
if(rt->hash_context == st->hash_context && !strcmp(rt->context, st->context)
&& (!rt->family_pattern || simple_pattern_matches(rt->family_pattern, st->family))) {
RRDCALC *rc = rrdcalc_create_from_template(host, rt, st->id);
if(unlikely(!rc))
info("Health tried to create alarm from template '%s' on chart '%s' of host '%s', but it failed", rt->name, st->id, host->hostname);
#ifdef NETDATA_INTERNAL_CHECKS
else if(rc->rrdset != st && !rc->foreachdim) //When we have a template with foreachdim, the child will be added to the index later
error("Health alarm '%s.%s' should be linked to chart '%s', but it is not", rc->chart?rc->chart:"NOCHART", rc->name, st->id);
#endif
}
}

void rrdcalctemplate_link_matching(RRDSET *st) {
RRDHOST *host = st->rrdhost;
RRDCALCTEMPLATE *rt;

for(rt = host->templates; rt ; rt = rt->next) {
if(rt->hash_context == st->hash_context && !strcmp(rt->context, st->context)
&& (!rt->family_pattern || simple_pattern_matches(rt->family_pattern, st->family))) {
RRDCALC *rc = rrdcalc_create_from_template(host, rt, st->id);
if(unlikely(!rc))
info("Health tried to create alarm from template '%s' on chart '%s' of host '%s', but it failed", rt->name, st->id, host->hostname);
rrdcalctemplate_link_matching_test(rt, st, host);
}

#ifdef NETDATA_INTERNAL_CHECKS
else if(rc->rrdset != st)
error("Health alarm '%s.%s' should be linked to chart '%s', but it is not", rc->chart?rc->chart:"NOCHART", rc->name, st->id);
#endif
}
for(rt = host->alarms_template_with_foreach; rt ; rt = rt->next) {
rrdcalctemplate_link_matching_test(rt, st, host);
}
}

@@ -43,6 +55,8 @@ inline void rrdcalctemplate_free(RRDCALCTEMPLATE *rt) {
freez(rt->units);
freez(rt->info);
freez(rt->dimensions);
freez(rt->foreachdim);
simple_pattern_free(rt->spdim);
freez(rt);
}

@@ -67,5 +81,3 @@ inline void rrdcalctemplate_unlink_and_free(RRDHOST *host, RRDCALCTEMPLATE *rt)

rrdcalctemplate_free(rt);
}
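
The condition in `rrdcalctemplate_link_matching_test` compares the cheap hash first, then the context string, then an optional family pattern. A rough standalone sketch of that ordering follows. Everything here is an assumption for illustration: `sketch_hash` is a plain 32-bit FNV-1a and is not claimed to be Netdata's `simple_hash`, POSIX `fnmatch` stands in for `simple_pattern_matches`, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit FNV-1a hash used only by this sketch. */
uint32_t sketch_hash(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Reduced stand-ins for RRDCALCTEMPLATE and RRDSET. */
struct tmpl  { const char *context; uint32_t hash_context; const char *family_glob; };
struct chart { const char *context; uint32_t hash_context; const char *family; };

/* Returns 1 when the template applies to the chart, 0 otherwise. */
int template_matches_chart(const struct tmpl *t, const struct chart *c) {
    assert(t && c);
    if (t->hash_context != c->hash_context)     /* cheap rejection first */
        return 0;
    if (strcmp(t->context, c->context) != 0)    /* guard against hash collisions */
        return 0;
    if (!t->family_glob)                        /* no family filter: match any family */
        return 1;
    return fnmatch(t->family_glob, c->family, 0) == 0;
}
```

In the diff, the same predicate runs once per template in each of the two template lists (`host->templates` and `host->alarms_template_with_foreach`).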

