📝 AI Startup

AI Model Basics — Your Smart Model! 🧠

0
Author
04e5cc8b-58ac-4bdc-bdee-661bbb
📅
Published
03.04.2026
⏱️
Reading time
6 min
👁️
Views
94
🌱
Level
Beginner

What is an AI Model?

An AI model is a program that learns from data and makes predictions.

Simple analogy:

A student learns to solve problems:
  Example problems → Studying → Solving new problems

An AI model learns the same way:
  Training data → Training → Predictions

An AI model = a student who learns from examples! 🎓


Components of an AI Model

1. Dataset

Training data used to train the model.

dataset = [
    {"input": "Great product!", "output": "positive"},
    {"input": "Terrible quality", "output": "negative"},
    {"input": "It's fine, nothing special", "output": "neutral"}
]

2. Training

The process by which the model learns from data.

model = AIModel()
model.train(dataset)  # The model learns!

3. Prediction

Using the trained model.

prediction = model.predict("Awesome item!")
print(prediction)  # "positive"

Key Model Metrics

1. Accuracy

The percentage of correct answers.

def calculate_accuracy(correct, total):
    """Accuracy = correct / total."""
    if total == 0:
        return 0
    return (correct / total) * 100

# Example
correct = 85  # 85 correct answers
total = 100   # out of 100 attempts
accuracy = calculate_accuracy(correct, total)
print(f"Accuracy: {accuracy}%")  # Accuracy: 85.0%

2. Loss

How much the model is getting wrong.

def calculate_loss(predictions, actuals):
    """Mean error."""
    if not predictions:
        return 0

    errors = [abs(pred - actual) for pred, actual in zip(predictions, actuals)]
    return sum(errors) / len(errors)

# Example
predictions = [0.9, 0.7, 0.3, 0.8]  # Model predictions
actuals = [1.0, 0.5, 0.0, 1.0]      # Ground truth values
loss = calculate_loss(predictions, actuals)
print(f"Loss: {loss:.2f}")  # Loss: 0.15

3. Dataset Size

How many training examples you have.

datasets = {
    "tiny": 100,        # Tiny
    "small": 1000,      # Small
    "medium": 10000,    # Medium
    "large": 100000,    # Large
    "huge": 1000000     # Huge
}

# More data = better model!

Model Lifecycle

Stage 1: Data Collection

def collect_dataset(size):
    """Collect training data."""
    dataset = []
    for i in range(size):
        example = {
            "id": i + 1,
            "text": f"Example text {i}",
            "label": "positive" if i % 2 == 0 else "negative"
        }
        dataset.append(example)
    return dataset

dataset = collect_dataset(1000)
print(f"Collected {len(dataset)} examples")

Stage 2: Training

class SimpleAIModel:
    """A simple AI model."""

    def __init__(self):
        self.dataset_size = 0
        self.accuracy = 0.5  # Initial accuracy 50%
        self.loss = 0.5      # Initial loss 50%
        self.is_trained = False

    def train(self, dataset):
        """Train the model."""
        self.dataset_size = len(dataset)

        # Accuracy grows with data (capped at 99%)
        improvement = min(0.49, self.dataset_size / 10000)
        self.accuracy = 0.5 + improvement

        # Loss decreases
        self.loss = 0.5 - improvement

        self.is_trained = True
        print(f"✅ Model trained on {self.dataset_size} examples")
        print(f"📊 Accuracy: {self.accuracy:.2%}")
        print(f"📉 Loss: {self.loss:.2f}")

    def predict(self, text):
        """Make a prediction."""
        if not self.is_trained:
            return "Model is not trained!"

        # Simplified prediction logic (real models are more complex!)
        if len(text) > 20:
            return "positive"
        else:
            return "negative"

# Usage
model = SimpleAIModel()
dataset = collect_dataset(5000)
model.train(dataset)

Stage 3: Testing

def test_model(model, test_data):
    """Evaluate the model on test data."""
    correct = 0
    total = len(test_data)

    for example in test_data:
        prediction = model.predict(example["text"])
        if prediction == example["label"]:
            correct += 1

    accuracy = (correct / total) * 100
    print(f"Test accuracy: {accuracy:.1f}%")
    return accuracy

# Test data
test_data = collect_dataset(100)
test_accuracy = test_model(model, test_data)

Stage 4: Improvement

def improve_model(model, additional_data):
    """Fine-tune the model on new data."""
    all_data = model.dataset_size + len(additional_data)
    improvement_dataset = collect_dataset(all_data)
    model.train(improvement_dataset)
    return model

# Fine-tuning
model = improve_model(model, collect_dataset(2000))

Types of AI Models

1. Classification

Assign a category to input.

model_classification = {
    "task": "Classification",
    "examples": [
        "Spam or not spam?",
        "Positive or negative review?",
        "Cat or dog in the photo?"
    ],
    "output": "Category (label)"
}

2. Regression

Predict a numeric value.

model_regression = {
    "task": "Regression",
    "examples": [
        "Apartment price",
        "Next month's sales",
        "Tomorrow's temperature"
    ],
    "output": "Number (value)"
}

3. Generation

Create something new.

model_generation = {
    "task": "Generation",
    "examples": [
        "Generate text",
        "Create an image",
        "Write code"
    ],
    "output": "New content"
}

Model Quality Metrics

Confusion Matrix

def confusion_matrix(predictions, actuals):
    """Count TP, TN, FP, FN."""
    tp = sum(1 for p, a in zip(predictions, actuals) if p == 1 and a == 1)
    tn = sum(1 for p, a in zip(predictions, actuals) if p == 0 and a == 0)
    fp = sum(1 for p, a in zip(predictions, actuals) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predictions, actuals) if p == 0 and a == 1)

    return {
        "true_positive": tp,   # Correctly identified positives
        "true_negative": tn,   # Correctly identified negatives
        "false_positive": fp,  # False alarm
        "false_negative": fn   # Missed positives
    }

# Example
predictions = [1, 0, 1, 1, 0, 1, 0, 0]
actuals = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix(predictions, actuals)
print(matrix)
# {'true_positive': 3, 'true_negative': 3, 'false_positive': 1, 'false_negative': 1}

Overfitting vs Underfitting

class ModelFit:
    """Check for overfitting."""

    @staticmethod
    def check_fit(train_accuracy, test_accuracy):
        """Determine fit type."""
        if train_accuracy < 0.7 and test_accuracy < 0.7:
            return "❌ Underfitting — model is too weak"
        elif train_accuracy > 0.9 and test_accuracy < 0.7:
            return "⚠️ Overfitting — model memorized the data"
        else:
            return "✅ Good Fit — model is performing well"

# Examples
print(ModelFit.check_fit(0.95, 0.60))  # Overfitting
print(ModelFit.check_fit(0.65, 0.63))  # Underfitting
print(ModelFit.check_fit(0.88, 0.85))  # Good Fit

Practical Example: a Startup Model

class TextClassifier:
    """AI model for text classification."""

    def __init__(self, name):
        self.name = name
        self.accuracy = 0.5
        self.loss = 0.5
        self.dataset_size = 0
        self.predictions_made = 0

    def train(self, dataset):
        """Train on a dataset."""
        self.dataset_size = len(dataset)

        # Quality improves with dataset size
        if self.dataset_size < 1000:
            self.accuracy = 0.65
            self.loss = 0.35
        elif self.dataset_size < 5000:
            self.accuracy = 0.80
            self.loss = 0.20
        elif self.dataset_size < 10000:
            self.accuracy = 0.90
            self.loss = 0.10
        else:
            self.accuracy = 0.95
            self.loss = 0.05

        print(f"🎓 {self.name} trained!")
        print(f"📊 Dataset: {self.dataset_size} examples")
        print(f"✅ Accuracy: {self.accuracy:.2%}")
        print(f"📉 Loss: {self.loss:.2f}")

    def predict(self, text):
        """Classify text."""
        self.predictions_made += 1

        # Simplified logic
        positive_words = ["excellent", "good", "super", "great", "awesome"]
        negative_words = ["bad", "terrible", "awful", "failure"]

        text_lower = text.lower()

        if any(word in text_lower for word in positive_words):
            return "positive"
        elif any(word in text_lower for word in negative_words):
            return "negative"
        else:
            return "neutral"

    def get_stats(self):
        """Model statistics."""
        return {
            "model": self.name,
            "accuracy": f"{self.accuracy:.2%}",
            "dataset_size": self.dataset_size,
            "predictions": self.predictions_made
        }

# Create and train the model
model = TextClassifier("SentimentAI v1.0")
dataset = collect_dataset(7500)
model.train(dataset)

# Use the model
reviews = [
    "Works great!",
    "Terrible product",
    "It's fine, nothing special"
]

for review in reviews:
    sentiment = model.predict(review)
    print(f"'{review}' → {sentiment}")

# Statistics
print("\n📊 Model stats:")
print(model.get_stats())

Improving the Model

Ways to boost quality:

1. Add more data

def add_more_data(model, additional_size):
    """Fine-tune on more data."""
    new_data = collect_dataset(additional_size)
    total_data = model.dataset_size + len(new_data)
    all_data = collect_dataset(total_data)
    model.train(all_data)
    return model

2. Improve data quality

def clean_dataset(dataset):
    """Remove bad examples from the dataset."""
    # Remove duplicates
    unique = {d["text"]: d for d in dataset}.values()

    # Remove short examples
    quality = [d for d in unique if len(d["text"]) > 10]

    return list(quality)

3. Tune hyperparameters

model_config = {
    "learning_rate": 0.001,      # Learning rate
    "batch_size": 32,            # Batch size
    "epochs": 10,                # Number of epochs
    "dropout": 0.2               # Regularization
}

Common Mistakes

❌ Mistake 1: Too little data

# BAD
tiny_dataset = collect_dataset(50)  # Way too small!
model.train(tiny_dataset)  # Accuracy will be poor

# ✅ GOOD
good_dataset = collect_dataset(5000)  # Sufficient
model.train(good_dataset)

❌ Mistake 2: Not testing the model

# BAD
model.train(dataset)
# Deploy immediately without checking!

# ✅ GOOD
model.train(dataset)
test_accuracy = test_model(model, test_data)
if test_accuracy > 80:
    deploy_model(model)

❌ Mistake 3: Overfitting

# BAD: training for too long
model.train(dataset, epochs=1000)  # Overfitting!

# ✅ GOOD: keep it controlled
model.train(dataset, epochs=10)
if train_acc - test_acc > 0.15:
    print("⚠️ Overfitting detected!")

Summary

An AI model is:

  • 🧠 A program that learns
  • 📚 Dataset — training data
  • 🎓 Training — the learning process
  • 📊 Metrics — accuracy, loss
  • 🔮 Predictions — inference

Key metrics:

model_metrics = {
    "accuracy": 0.88,       # 88% accuracy
    "loss": 0.12,           # 12% loss
    "dataset_size": 10000,  # 10K examples
    "predictions": 5000     # 5K predictions
}

Lifecycle:

Data Collection → Training → Testing → Improvement → Deployment

What’s Next?

Now you know the basics of AI models! 🎉

Next topics:
- Model metrics — precision, recall, F1-score
- Model types — classification, regression, generation
- Training — optimization, regularization

Build your own AI model and conquer the market! 🤖💡

Your reaction to the article

💬 Comments (0)

🔐 Sign in to leave a comment
🚪 Login
💭

No comments yet

Be the first to share your opinion about this article!

🔗 Similar

Similar articles

Continue learning with these materials

📝

Anthropic SDK: Getting Started with the Claude API

Anthropic Python SDK is the official library for working with Claude. It hides the complexity...

📅 04.06.2026 👁️ 18
📝

Docker Compose: Multi-Container Applications

Docker Compose is a tool for running multiple related containers as a single application. The...

📅 08.05.2026 👁️ 50
📝

Dockerfile: Building Your Own Images

Dockerfile — a text file with instructions for building an image. Each instruction becomes a...

📅 08.05.2026 👁️ 55

Did you like the article?

Subscribe to our updates and receive new articles first. Grow with PyLand!