AutoGPT/docs/content/platform/blocks/update/sampling.md at update-block-docs

github/AutoGPT

Fork 0

mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-01-14 01:28:11 -05:00

Files

Bently 9572415b74 Update block docs for: sampling.md

2025-01-13 10:23:46 +00:00

1.9 KiB

Raw Permalink Blame History

Data Sampling

What it is

A versatile data sampling tool that can select specific items from a dataset using various selection methods.

What it does

Takes a collection of data items and returns a smaller subset based on user-defined criteria and sampling methods. It can work with different types of data collections and offers multiple ways to choose which items to include in the sample.

How it works

The system looks at your data collection and selects items based on your chosen sampling method. It can pick items:

Completely randomly
At regular intervals
From specific groups proportionally
Based on importance weights
In clusters or groups
From the beginning or end
Using reservoir sampling for streaming data

Inputs

Data: The collection of items you want to sample from
Sample Size: How many items you want in your final selection
Sampling Method: How you want to choose the items (random, systematic, top, bottom, stratified, weighted, reservoir, or cluster)
Accumulate: Whether to collect data over time before sampling
Random Seed: A number to ensure you get the same results each time (optional)
Stratify Key: The category to use when ensuring balanced group representation
Weight Key: The value to use when considering item importance
Cluster Key: The group identifier for cluster-based sampling

Outputs

Sampled Data: The selected items from your dataset
Sample Indices: The positions of the selected items in the original dataset

Possible use cases

Quality control in manufacturing: Randomly selecting products for inspection
Market research: Selecting a representative group of customers to survey
Data analysis: Creating balanced training datasets for machine learning
Scientific research: Selecting specimens for detailed analysis
Social studies: Choosing participants for a study while maintaining demographic balance

1.9 KiB Raw Permalink Blame History