in_tail: possibly collects duplicated logs in rotate_wait #4243

Open
daipom opened this issue Jul 17, 2023 · 1 comment
Labels
bug Something isn't working


daipom commented Jul 17, 2023

Describe the bug

I think the major log duplication problem with follow_inodes was fixed in #4237.
However, duplication is still possible (#4237 (comment)).

in_tail can collect duplicated logs during rotate_wait.
Multiple TailWatchers can exist during rotate_wait, so they can collect from the same target in that interval.

We need a mechanism that prevents adding a duplicate TailWatcher while another one is being detached.
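
One possible shape for such a mechanism is sketched below. This is a toy registry, not Fluentd code: the class `WatcherRegistry` and its methods `add`, `start_detaching`, and `finish_detaching` are hypothetical names invented for illustration, and targets are plain strings standing in for inodes.

```ruby
# Hypothetical sketch: in_tail does not have this class.
# It tracks which targets have an active or a detaching watcher
# and refuses to add a second watcher for the same target.
class WatcherRegistry
  def initialize
    @active = {}     # target => watcher name
    @detaching = {}  # target => watcher name, kept until rotate_wait elapses
  end

  # Returns true if the watcher was added, false if the target
  # is already covered by an active or detaching watcher.
  def add(target, watcher)
    return false if @active.key?(target) || @detaching.key?(target)

    @active[target] = watcher
    true
  end

  # Called when rotation is detected: the watcher keeps reading
  # the target but is scheduled for removal.
  def start_detaching(target)
    watcher = @active.delete(target)
    @detaching[target] = watcher if watcher
    watcher
  end

  # Called once rotate_wait has elapsed.
  def finish_detaching(target)
    @detaching.delete(target)
  end
end
```

With this design, a target refused during detaching has no watcher until a later refresh adds one. Whether that gap is acceptable is a design question the sketch does not settle.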

I also wrote the code comment as follows in #4237:

# https://github.com/fluent/fluentd/pull/4237#issuecomment-1633358632
# Because of this problem, log duplication can occur during `rotate_wait`.
# Need to set `rotate_wait 0` for a workaround.
# Duplication will occur if `refresh_watcher` is called during the `rotate_wait`.
# In that case, `refresh_watcher` will add the new TailWatcher to tail the same target,
# and it causes the log duplication.
# (Other `detach_watcher_after_rotate_wait` may have the same problem.
# We need the mechanism not to add duplicated TailWatcher with detaching TailWatcher.)
detach_watcher_after_rotate_wait(tail_watcher, pe.read_inode)

This does not mean that all logs in a file are duplicated in their entirety.
(I believe such problems were resolved in #4237.)


Detailed explanation (from #4237 (comment))

The scenario: rotate -> update_watcher -> refresh_watcher -> rotate_wait elapsed

  • Initial state
path-1: inode-b, TailWatcher-1
path-2: inode-a, TailWatcher-2
  • Rotate happens
path-1: inode-c, TailWatcher-1-new
path-2: inode-b, TailWatcher-2-new, TailWatcher-1-detaching
path-3: inode-a, TailWatcher-2-detaching
  • refresh_watcher
path-1: inode-c, TailWatcher-1-new
path-2: inode-b, TailWatcher-2-new, TailWatcher-1-detaching
path-3: inode-a, TailWatcher-2-detaching, TailWatcher-3
  • rotate_wait elapsed
path-1: inode-c, TailWatcher-1-new
path-2: inode-b, TailWatcher-2-new
path-3: inode-a, TailWatcher-3

As shown above, rotate_wait temporarily leaves multiple TailWatchers attached to the same target.
This causes log duplication.
I have confirmed this on v1.16.1.
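
To make the overlap concrete, here is a small toy model of the refresh_watcher stage listed above. It only counts watchers per inode exactly as written in that listing. It does not use Fluentd's real TailWatcher class and does not model which watcher actually emits a line. The helper names are made up for illustration.

```ruby
# Toy model: each watcher is a plain hash, not a Fluentd object.
def stage_after_refresh
  [
    { name: "TailWatcher-1-new",       inode: "inode-c", detaching: false },
    { name: "TailWatcher-2-new",       inode: "inode-b", detaching: false },
    { name: "TailWatcher-1-detaching", inode: "inode-b", detaching: true },
    { name: "TailWatcher-2-detaching", inode: "inode-a", detaching: true },
    { name: "TailWatcher-3",           inode: "inode-a", detaching: false },
  ]
end

# Inodes that currently have more than one watcher attached.
def inodes_with_multiple_watchers(watchers)
  watchers.group_by { |w| w[:inode] }
          .select { |_inode, ws| ws.size > 1 }
          .keys
end

# Once rotate_wait elapses, detaching watchers are removed.
def after_rotate_wait(watchers)
  watchers.reject { |w| w[:detaching] }
end

during = inodes_with_multiple_watchers(stage_after_refresh)
after  = inodes_with_multiple_watchers(after_rotate_wait(stage_after_refresh))
puts "during rotate_wait: #{during.inspect}"  # ["inode-b", "inode-a"]
puts "after rotate_wait:  #{after.inspect}"   # []
```

The overlap disappears as soon as the detaching watchers are removed, which matches the final stage of the listing.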

To Reproduce

From #4237 (comment).

Use the config below.

  • Start tailing the following file
test.log
  • Rotate
test.log
test.log.1
  • Append a log to test.log.1.
    • When refresh_interval has NOT elapsed and rotate_wait has NOT elapsed
      • No duplication
    • When refresh_interval has elapsed and rotate_wait has NOT elapsed
      • DUPLICATION
    • When refresh_interval has elapsed and rotate_wait has elapsed
      • No duplication
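
The file operations in the steps above could be scripted roughly as follows. This is a sketch under assumptions: the helper names `start_log`, `rotate`, and `append` are invented here, and rotation is assumed to be a rename followed by creating a fresh test.log, which the issue does not state explicitly.

```ruby
require "fileutils"

# Step 1: create the file that in_tail starts tailing.
def start_log(dir)
  path = File.join(dir, "test.log")
  File.write(path, "line before rotation\n")
  path
end

# Step 2: rotate by renaming (assumption: rename-style rotation).
# On POSIX filesystems a rename keeps the inode, so test.log.1
# is the same inode that test.log had before.
def rotate(dir)
  current = File.join(dir, "test.log")
  rotated = File.join(dir, "test.log.1")
  File.rename(current, rotated)
  FileUtils.touch(current) # fresh test.log with a new inode
  rotated
end

# Step 3: append one line to the rotated file.
def append(path, line)
  File.open(path, "a") { |f| f.puts(line) }
end
```

To hit the duplication case with the configuration in this report, the append would need to happen after refresh_interval (5s) has passed since the rotation but before rotate_wait (30s) has passed, with Fluentd running against the same directory.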

Expected behavior

Log duplication does not occur.

Your Environment

- Fluentd version: 1.16.1, 1.16.2
- Operating system: Ubuntu 20.04.6 LTS
- Kernel version: 5.15.0-71-generic

Your Configuration

<source>
  @type tail
  tag test
  path /path/to/test.log*
  pos_file /test/fluentd/pos/pos
  follow_inodes true
  refresh_interval 5s
  enable_stat_watcher false # To ensure that TailWatcher recognizes rotation
  rotate_wait 30s
  <parse>
    @type none
  </parse>
</source>

<match test.**>
  @type stdout
</match>

Your Error Log

No error.

Additional context

As a workaround, we can set rotate_wait 0.


daipom commented Jul 17, 2023

I think this is never a problem with follow_inodes false, because wildcards can't be used in the path setting in that mode. (It is by specification that log duplication can occur when wildcards are used with follow_inodes false.)
