
--fileoutput writes to stdout instead of file #213

Closed
theboringstuff opened this issue Feb 26, 2024 · 21 comments · Fixed by #255

Comments

@theboringstuff commented Feb 26, 2024

Describe the bug
It seems --fileoutput no longer works in 1.7.0 after #162

Instead of writing to the file, it writes to stdout twice. The reason is most probably a recursive call to the shadowed print function.
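As an illustration only, here is a minimal, hypothetical sketch (not KRR's actual code) of how a shadowed print can re-enter itself and produce exactly this symptom:

    import sys

    file_output = "report.yaml"  # stands in for --fileoutput

    def custom_print(*args, force_stdout: bool = False) -> None:
        if file_output and not force_stdout:
            # intended: write the report into file_output here
            # actual bug: re-enters the shadowed printer instead
            custom_print(*args, force_stdout=True)
        sys.stdout.write(" ".join(map(str, args)) + "\n")

    print = custom_print  # the shadowing that hides the recursion

    print("recommendations")  # appears twice on stdout, never in report.yaml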

To Reproduce

  1. Run KRR with simple -f yaml --fileoutput <file>
  2. Recommendations are printed twice to stdout and are not written to the file

Expected behavior
Recommendations are printed once to stdout and written once to the file

Are you interested in contributing a fix for this?
Yes

@aantn (Contributor) commented Mar 2, 2024

Hey, this should be fixed by #218. Can you please confirm it works as expected?

@theboringstuff (Author)

I tried to run from the branch with the fix, but unfortunately got another error which does not reproduce on 1.6.0:

ERROR    Failed to load pods for Deployment ###: 'CustomPrometheusConnect' object has no attribute 'safe_custom_query'                                           loader.py:76
        Traceback (most recent call last):                                                                                                                                                                      
          File "/home/stanislav/work/tmp/krr/krr-sources/robusta_krr/core/integrations/prometheus/loader.py", line 74, in load_pods                                                                             
            return await self.loader.load_pods(object, period)                                                                                                                                                  
          File "/home/stanislav/work/tmp/krr/krr-sources/robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py", line 183, in load_pods                                        
            replicasets = await self.query(                                                                                                                                                                     
          File "/home/stanislav/work/tmp/krr/krr-sources/robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py", line 111, in query                                            
            return await loop.run_in_executor(self.executor, lambda: self.prometheus.safe_custom_query(query=query))                                                                                            
          File "/usr/lib/python3.9/concurrent/futures/thread.py", line 58, in run                                                                                                                               
            result = self.fn(*self.args, **self.kwargs)                                                                                                                                                         
          File "/home/stanislav/work/tmp/krr/krr-sources/robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py", line 111, in <lambda>                                         
            return await loop.run_in_executor(self.executor, lambda: self.prometheus.safe_custom_query(query=query))                                                                                            
        AttributeError: 'CustomPrometheusConnect' object has no attribute 'safe_custom_query' 

I provide the Prometheus URL using the -p option.

@aantn (Contributor) commented Mar 4, 2024 via email

@theboringstuff (Author)

Upgrading requirements helped with the above issue. But then I encountered two more:

  1. Issue with loading metrics
    [11:14:04] ERROR    Failed to load pods for Deployment ###: string indices must be integers                                                 loader.py:76
                        Traceback (most recent call last):                                                                                                                                                                      
                          File "/home/stanislav/work/tmp/krr/krr-sources/robusta_krr/core/integrations/prometheus/loader.py", line 74, in load_pods                                                                             
                            return await self.loader.load_pods(object, period)                                                                                                                                                  
                          File "/home/stanislav/work/tmp/krr/krr-sources/robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py", line 194, in load_pods                                        
                            pod_owners = [replicaset["metric"]["replicaset"] for replicaset in replicasets]                                                                                                                     
                          File "/home/stanislav/work/tmp/krr/krr-sources/robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py", line 194, in <listcomp>                                       
                            pod_owners = [replicaset["metric"]["replicaset"] for replicaset in replicasets]                                                                                                                     
                        TypeError: string indices must be integers   
    
    I fixed that manually by changing the source code: changed replicasets to replicasets["result"] (see the sketch after this list). It seems the underlying data structure was changed. The same fix is also needed for the related_pods_result and pods_status_result structures.
  2. After the above issue was fixed manually, KRR finished its work and the report file was produced, but the report was not present in stdout. The last line in stdout is the following:
               INFO     Result collected, displaying...                                                                                                                                                            runner.py:250
    
    In v1.6.0 the report was also printed to stdout; now it is not. I am not using --quiet, I am using --fileoutput ./report.yaml -f yaml --verbose, so I would expect both stdout and the file to contain the report.
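To illustrate item (1), here is a small, hypothetical example of the data-shape change and the manual fix (the sample response below is made up):

    # The query now returns the raw Prometheus response dict, so the vector
    # lives under the "result" key; iterating the dict itself yields string
    # keys, hence "string indices must be integers".
    response = {
        "resultType": "vector",
        "result": [
            {"metric": {"replicaset": "my-deployment-7c9f"}, "value": [1709500000, "1"]},
        ],
    }

    pod_owners = [item["metric"]["replicaset"] for item in response["result"]]
    print(pod_owners)  # ['my-deployment-7c9f']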

@aantn (Contributor) commented Mar 4, 2024

Thanks for the update. Regarding (1), that was fixed on master already. I've merged it into this branch now, so you shouldn't need that change anymore.

Yes, it would make more sense to print to both locations when --quiet is not passed. I'll keep this issue open for now.
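For clarity, a minimal sketch of the behaviour agreed on here (my assumption of the intended logic, not KRR's actual implementation):

    from typing import Optional

    # Write the report to --fileoutput when given, and also echo it to
    # stdout unless --quiet was passed.
    def emit_report(report: str, file_output: Optional[str], quiet: bool) -> None:
        if file_output:
            with open(file_output, "w") as f:
                f.write(report)
        if not quiet:
            print(report)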

LeaveMyYard added a commit that referenced this issue Mar 4, 2024
Fix --quiet and json/yaml output (fixes #214 and #213)
@aantn (Contributor) commented Mar 14, 2024

@theboringstuff is this fixed in #231?

@theboringstuff (Author)

Trying to test from the branch; for some reason KRR gets stuck indefinitely after the following line:

           INFO     Calculated recommendations for Deployment *** (using 4 metrics)                                                        runner.py:176

This does not happen on the 1.7.1 tag.

I am running a command like:

krr simple --verbose -k ./kubeconfig -p $PROM_URL --prometheus-label $KEY -l $VALUE --timeframe_duration 5 --history_duration 10 -w 1 -n $NAMESPACE --fileoutput test.yaml -f yaml --prometheus-auth-header="$AUTH_HEADER"

@theboringstuff (Author) commented Mar 15, 2024

In addition to the above, I am encountering a strange issue on 1.7.1 with stdout:

  1. When I run KRR from bash, scans are shown in stdout in YAML format
  2. When I run KRR from Go, scans are not shown in stdout in YAML format

In both cases I execute the same command, but the stdout is different. I suspect this has something to do with buffer flushing, because after I added logger.info("PRINTING") to runner.custom_print it started working from both bash and Go. This is confusing.
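For what it's worth, here is a sketch of the buffering hypothesis only (an assumption, not a confirmed root cause): Python block-buffers stdout when it is a pipe (e.g. when spawned from Go via os/exec) but line-buffers it on a terminal, so output can sit in the buffer unless it is flushed explicitly.

    import sys

    def emit(text: str) -> None:
        sys.stdout.write(text)
        sys.stdout.flush()  # an explicit flush makes pipe and TTY behave the same

    if __name__ == "__main__":
        print("stdout is a TTY:", sys.stdout.isatty())
        emit("recommendations...\n")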

@aantn (Contributor) commented Mar 15, 2024 via email

@theboringstuff (Author) commented Mar 18, 2024

I mean running it programmatically from Go using the os/exec package. I am always running KRR as follows:

krr simple --verbose -k ./kubeconfig -p $PROM_URL --prometheus-label $KEY -l $VALUE --timeframe_duration 5 --history_duration 10 -w 1 -n $NAMESPACE --fileoutput test.yaml -f yaml --prometheus-auth-header="$AUTH_HEADER"

The above command behaves differently on different versions:

  1. On v1.6.0 it always completes, and scans are present in both stdout AND the file
  2. On v1.7.1 it depends. If I run from bash, it works correctly as in v1.6.0. But if I run the same command from Go using the standard os/exec package functions, there are no scans in stdout; the output ends at "Results collected, displaying" (but the KRR process completes, it is not stuck, and scans are present in the file). To debug this problem, I added a custom logger.info("PRINTING") in KRR's robusta_krr.core.runner.custom_print, and after that scans appeared in stdout consistently. My suspicion is therefore that KRR output may sometimes not be flushed to stdout, for reasons unknown. I do not understand why the same command behaves differently when executed from bash vs Go, but I assume this is not Go's fault (especially given the logger.info("PRINTING") "fix").
  3. On the fix-fileoutput branch it always gets stuck after the last log line (before printing results):
               INFO     Calculated recommendations for Deployment *** (using 4 metrics)                                                        runner.py:176
    

@aantn (Contributor) commented Mar 26, 2024

We're investigating a related problem. Will update.

@aantn (Contributor) commented Mar 27, 2024

@theboringstuff does this still occur on the latest commit on master? Curious if it is related to #245 which we just fixed.

@theboringstuff (Author)

Yes, on the main branch I get scans in both stdout and the file. However, now I get the following error each time I run KRR:

krr simple --fileoutput test -f yaml --verbose -k $KUBECONFIG -p $PROMETHEUS --prometheus-auth-header=$AUTH_HEADER --prometheus-label $LABEL -l $KEY --timeframe_duration 5 --history_duration 100 -w 1 -n $NAMESPACE
...
           DEBUG    Returned from get_history_range: []                                                                                                                                    prometheus_metrics_service.py:171
           ERROR    Was not able to get history range for cluster admin                                                                                                                                        runner.py:190
                    Traceback (most recent call last):                                                                                                                                                                      
                      File "/home/stanislav/work/tmp/krr/krr-sources/robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py", line 167, in get_history_range                                
                        values = result[0]["values"]                                                                                                                                                                        
                    IndexError: list index out of range   
 ...

KRR completes successfully, but this ERROR with a stack trace does not look right, and it is not clear what to do with it. Should I create a separate issue for this?

@aantn (Contributor) commented Apr 1, 2024

@theboringstuff can you try reinstalling the dependencies with pip? This might be due to a dependency that we upgraded.

@theboringstuff (Author)

I reinstalled the KRR dependencies like this:

python3.9 -m pip install -r requirements.txt --upgrade --force-reinstall

I am still getting the above issue.

@aantn (Contributor) commented Apr 1, 2024

Thanks, looking into it.

@LeaveMyYard (Contributor)

@theboringstuff

Was not able to get history range for cluster admin

This is a new mechanism we added in the latest versions to additionally check whether there is enough data in Prometheus. I implemented it so that if the check fails, we still try to do the scan (exactly as in your case).

So you can just ignore it, as it does not affect the results. But I will change that so the message is not so scary (maybe make it a warning?)
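For reference, a hedged sketch of what the warning-instead-of-error variant could look like (history_range_or_none is a hypothetical helper, not KRR's actual function):

    import logging

    logger = logging.getLogger(__name__)

    # If Prometheus returns no data for the history-range check, log a
    # warning and continue instead of letting result[0]["values"] raise
    # IndexError.
    def history_range_or_none(result: list):
        if not result:
            logger.warning("No history range data returned by Prometheus; continuing with the scan")
            return None
        return result[0]["values"]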

@theboringstuff (Author)

maybe make it a warning?

That would be nice, because we have automated processing of KRR output, and if it contains ERROR, the automation assumes something is wrong. So having a WARNING (maybe without the stack trace) would be better for us.

@LeaveMyYard (Contributor)

because we have automated processing of KRR output

Is there a reason you process the output for ERRORs? I feel like CLI return codes from KRR's side should work for that.
Also, I think we committed a change so that the CLI returns a non-zero code in case Prometheus is not found or no successful scans are made.

@theboringstuff (Author) commented Apr 1, 2024

Is there a reason you process the output for ERRORs? I feel like CLI return codes from KRR's side should work for that. Also, I think we committed a change so that the CLI returns a non-zero code in case Prometheus is not found or no successful scans are made.

We process both the exit code and ERROR messages. The reason is that I once got the following situation:

  1. exitCode == 0
  2. there are no scans
  3. there is an ERROR in the output (I do not remember which exactly)

So, in this case, if we did not check for ERROR in the output, we would get a misleading result (everything looks OK, but scans are not present). I do not remember the exact case/error for the combination error != nil && exitCode == 0; maybe it was an incorrect kubeconfig or something (it was on version v1.6.0).

Maybe this particular case no longer applies on main, but I feel it still makes sense to check for ERROR, because I am not sure the exit code is always non-zero when there are CRITICAL errors.

In short, we check for ERROR just to double-check that we did not miss anything when the exit code is 0.
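As an illustration (in Python rather than Go, and only a sketch of the wrapper logic described above, with flags shortened for brevity), the double check amounts to:

    import subprocess

    # Run KRR, trust the exit code first, then scan the captured output for
    # ERROR lines as a safety net.
    proc = subprocess.run(
        ["krr", "simple", "-f", "yaml", "--fileoutput", "report.yaml"],
        capture_output=True,
        text=True,
    )

    has_error_line = any("ERROR" in line for line in (proc.stdout + proc.stderr).splitlines())
    if proc.returncode != 0 or has_error_line:
        raise RuntimeError("KRR run needs manual inspection")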

@LeaveMyYard (Contributor) commented Apr 1, 2024

I still feel we need to do the required work on our side so that you can rely on exit codes only (maybe it has even been done already).
In any case, making that one error a warning will also work, so we will choose the easiest solution.

Thanks for the explanation, @theboringstuff
