package logmower

import (
	"context"
	"fmt"
	"log"
	"os"
	"os/signal"
	"path/filepath"
	"sync"
	"time"

	ms "git.k-space.ee/k-space/logmower-shipper/pkg/mongoStruct"
	"github.com/fsnotify/fsnotify"
	prom "github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/urfave/cli/v2"
	"k8s.io/apimachinery/pkg/util/wait"
)

const DatabaseCommandTimeout = 10 * time.Second

// defaultBackoff returns a fresh wait.Backoff on every call, forcing callers
// to take their own copy instead of sharing (and mutating) a single value.
func defaultBackoff() wait.Backoff {
	return wait.Backoff{
		Duration: 2 * time.Second,
		Factor:   1.5,
		Jitter:   0.1,
		Cap:      30 * time.Second,
	}
}

// mongoTimeoutCtx derives a context that expires after DatabaseCommandTimeout.
func mongoTimeoutCtx(ctx context.Context) context.Context {
	ctx, _ = context.WithTimeout(ctx, DatabaseCommandTimeout) //nolint:lostcancel // cancelled by mongo; if it leaks, that is a bug on their side (TODO)
	return ctx
}
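// Illustrative only, not part of the original file: a consumer of
// defaultBackoff would presumably copy-and-use it with the retry helpers in
// k8s.io/apimachinery/pkg/util/wait, along these lines (tryOnce is a
// hypothetical retryable unit of work, e.g. a database ping):
//
//	err := wait.ExponentialBackoff(defaultBackoff(), func() (done bool, err error) {
//		return tryOnce(), nil
//	})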
var App = &cli.App{
	Name:        "logmower-shipper",
	Version:     "1.0.0",
	Authors:     []*cli.Author{{Name: "jtagcat"}},
	Description: "Collect and ship kubernetes logs",
	// Usage: "rubykana ", // TODO: #2: yaml

	Flags: []cli.Flag{
		&cli.BoolFlag{Name: "dry-run", Usage: "Do not write to database"}, // TODO:
		&cli.StringFlag{Name: "log-directory", Usage: "Directory to watch for logs", Value: "/var/log/containers"},
		&cli.IntFlag{Name: "max-record-size", Value: 128 * 1024, Usage: "Maximum record size in bytes"}, // TODO:
		&cli.BoolFlag{Name: "normalize-log-level", Usage: "Normalize log.level values to Syslog defined keywords"}, // TODO:
		// &cli.BoolFlag{Name: "parse-json"}, //TODO:
		&cli.StringFlag{Category: "k8s metadata", Name: "pod-namespace", EnvVars: []string{"KUBE_POD_NAMESPACE"}}, // TODO:
		&cli.StringFlag{Category: "k8s metadata", Name: "node-name", EnvVars: []string{"KUBE_NODE_NAME"}, Required: true},
		&cli.StringFlag{Category: "secrets", Name: "mongo-uri", EnvVars: []string{"MONGO_URI"}, Usage: "mongodb://foo:bar@host:27017/database", Required: true},
	},

	Before: func(ctx *cli.Context) error {
		if ctx.Int("max-record-size") < 1 {
			return fmt.Errorf("max-record-size must be positive")
		}
		return nil
	},

	Action: func(ctx *cli.Context) error {
		var (
			promWatcherOnline = promauto.NewGauge(prom.GaugeOpts{
				Namespace: PrometheusPrefix,
				Subsystem: "watcher",
				Name:      "online",
				Help:      "1 if initialized and the directory watcher has been engaged successfully",
			})
			promWatcherErr = promauto.NewCounter(prom.CounterOpts{
				Namespace: PrometheusPrefix,
				Subsystem: "watcher",
				Name:      "errors",
				Help:      "Errors while watching log files",
			})
			promWatcherFilesStarted = promauto.NewCounter(prom.CounterOpts{
				Namespace: PrometheusPrefix,
				// Subsystem: "watcher",
				Name: "log_file", // "discovered_logfiles",
				Help: "Number of tracked log files",
			})
			promWatcherFilesSkipped = promauto.NewCounter(prom.CounterOpts{
				Namespace: PrometheusPrefix,
				// Subsystem: "watcher",
				Name: "invalid_filename", // "skipped_files",
				Help: "Number of files in log directory skipped due to unexpected filename",
			})
			promWatcherEvents = promauto.NewCounter(prom.CounterOpts{
				Namespace: PrometheusPrefix,
				// Subsystem: "watcher",
				Name: "inotify_event", // "events",
				Help: "Number of events while watching (includes initial create events for existing file discovery)",
			})
		)

		ctx.Context, _ = signal.NotifyContext(ctx.Context, os.Interrupt) // TODO: test
		var wg sync.WaitGroup

		log.Printf("%s %s starting", ctx.App.Name, ctx.App.Version)

		db, err := initDatabase(ctx.Context, ctx.String("mongo-uri"))
		if err != nil {
			return fmt.Errorf("initializing database connection: %w", err)
		}

		var hostInfo ms.HostInfo
		if err := hostInfo.Populate(ctx.String("node-name")); err != nil {
			return fmt.Errorf("populating host info: %w", err)
		}

		watcher, err := fsnotify.NewWatcher()
		if err != nil {
			return fmt.Errorf("initializing log directory watcher: %w", err)
		}
		defer watcher.Close()

		wg.Add(1)
		go func() {
			defer wg.Done()

			for {
				select {
				case <-ctx.Context.Done():
					return

				case event, ok := <-watcher.Events:
					if !ok {
						return
					}
					promWatcherEvents.Add(1)

					if event.Op != fsnotify.Create {
						continue
					}
					// TODO: #1: || if not in filterset

					kubeInfo, ok := ms.ParseLogName(event.Name)
					if !ok {
						promWatcherFilesSkipped.Add(1)
						log.Printf("skipped %q: filename not parsable in kubernetes log format", filepath.Base(event.Name))
						continue
					}

					promWatcherFilesStarted.Add(1)
					wg.Add(1)
					go func() {
						defer wg.Done()

						file := file{
							File: ms.File{
								Host:     &hostInfo,
								KubeInfo: kubeInfo,
								Path:     event.Name,
							},
							metricsName: filepath.Base(event.Name),
						}

						file.Process(ctx.Context, db, ctx.Int("max-record-size"))
					}()

				case err, ok := <-watcher.Errors:
					if !ok {
						return
					}
					promWatcherErr.Add(1)
					log.Printf("watching for new logs: %v", err)
				}
			}
		}()

		logDir := ctx.String("log-directory")

		// simulate create events to pick up files already created
		if err := simulateInitialCreates(logDir, watcher.Events); err != nil {
			return fmt.Errorf("listing log directory %q: %w", logDir, err)
		}

		if err := watcher.Add(logDir); err != nil {
			return fmt.Errorf("watching for new logs in %q: %w", logDir, err)
		}

		promWatcherOnline.Set(1)

		// waiting indefinitely for interrupt
		wg.Wait() // wait for watcher and file processors to clean up

		return ctx.Err()
	},
}

// simulateInitialCreates synthesizes a Create event for every file already in
// dirName, so pre-existing logs go through the same code path as new ones.
func simulateInitialCreates(dirName string, eventChan chan<- fsnotify.Event) error {
	dir, err := os.ReadDir(dirName)
	if err != nil {
		return err
	}

	for _, file := range dir {
		eventChan <- fsnotify.Event{
			Name: filepath.Join(dirName, file.Name()),
			Op:   fsnotify.Create,
		}
	}

	return nil
}
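// For reference only, a minimal sketch of how App would be wired up from a
// main package; the cmd entry point is not part of this file, and the import
// path is an assumption inferred from the pkg/mongoStruct import above:
//
//	package main
//
//	import (
//		"log"
//		"os"
//
//		"git.k-space.ee/k-space/logmower-shipper/pkg/logmower"
//	)
//
//	func main() {
//		if err := logmower.App.Run(os.Args); err != nil {
//			log.Fatal(err)
//		}
//	}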