Class DistributionStrategy

java.lang.Object
io.aiven.kafka.connect.common.source.task.DistributionStrategy

public final class DistributionStrategy extends Object
A DistributionStrategy provides a mechanism for dividing the work of processing records from objects (or files) among tasks, which are subsequently processed (potentially in parallel) by Kafka Connect workers.

The number of objects in cloud storage can be very high. Selecting a distribution strategy tells the connector how to distribute the load across connector tasks, and in some cases an appropriate strategy can also maintain a level of ordering between messages.

  • Method Details

    • getTaskFor

      public int getTaskFor(Context<?> ctx)
      Retrieves the taskId that should process this object. Each object is assigned deterministically to a single taskId: the same context always returns the same taskId.
      Parameters:
      ctx - This is the context which contains optional values for the partition, topic and storage key name
      Returns:
      the taskId to which this particular object should be assigned.
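      The deterministic assignment described above can be illustrated with a minimal sketch. It is not this class's implementation: the class HashAssignmentSketch, the method taskFor, and the choice of hashing the storage key are all assumptions made for illustration. The real strategy may derive its value from the partition or another part of the context.

```java
// Hypothetical sketch, not part of the connector API.
final class HashAssignmentSketch {

    private HashAssignmentSketch() {
        // static helper only
    }

    /**
     * Maps a storage key to a task id in the range [0, maxTasks).
     * String.hashCode() is specified by the Java language, so the same key
     * always yields the same task id for a given maxTasks.
     */
    static int taskFor(final String storageKey, final int maxTasks) {
        if (maxTasks <= 0) {
            throw new IllegalArgumentException("maxTasks must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(storageKey.hashCode(), maxTasks);
    }
}
```

      Under this assumption, calling taskFor twice with the same key and the same maxTasks returns the same task id, which is the property getTaskFor documents.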
    • configureDistributionStrategy

      public void configureDistributionStrategy(int maxTasks)
      When a connector receives a reconfigure event, this method should be called to ensure that the distribution strategy is updated correctly.
      Parameters:
      maxTasks - The maximum number of tasks created for the Connector
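      Why a reconfigure matters can be seen in a small sketch. The class ReconfigurableAssignerSketch and its methods are hypothetical, and hashing the storage key is an assumption rather than the connector's actual logic. The point is only that an assignment depends on the task count, so the count must be refreshed when it changes.

```java
// Hypothetical sketch, not part of the connector API.
final class ReconfigurableAssignerSketch {

    // volatile so that a reconfigure is visible to threads computing assignments
    private volatile int maxTasks;

    ReconfigurableAssignerSketch(final int maxTasks) {
        configure(maxTasks);
    }

    /** Plays the role of configureDistributionStrategy(int) in this sketch. */
    void configure(final int newMaxTasks) {
        if (newMaxTasks <= 0) {
            throw new IllegalArgumentException("maxTasks must be positive");
        }
        this.maxTasks = newMaxTasks;
    }

    /** Assigns a storage key to a task id in the range [0, maxTasks). */
    int taskFor(final String storageKey) {
        return Math.floorMod(storageKey.hashCode(), maxTasks);
    }
}
```

      In this sketch, an object assigned to one task under the old task count may move to a different task after configure is called, and without the update an object could be routed to a task id that no longer exists.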