Caption Creation Guidelines

  1. Accuracy: Be descriptive and precise, in a tone that is neither overly formal nor overly casual.
  2. Focus: Describe what's visible; avoid assumptions and interpretations.
  3. Language: Use natural language that flows well when read aloud.

Specific Rules

  1. Use "the" for large, contextual elements (e.g., the ocean, the sky).

  2. Use "a/an" for distinct, countable objects (e.g., a beach ball, an umbrella).

  3. Include relevant details about the subject:

    • Approximate age
    • Ethnicity or race
    • Notable features
    • Body type
    • For male anatomy: circumcision status (cut/uncut)
    • Pose or position
    • Relevant background elements

  4. Be specific, but avoid overly clinical terms and regional slang:

    • Preferred: "penis," "dick," or "cock" instead of "genitals" or local slang
    • Preferred: "butt" or "ass" instead of "gluteal region" or crude terms

  5. Describe the setting and any relevant actions or emotions.

Captioning Tools

  1. LLaVA Interrogate: Basic image description
  2. XComposer: More detailed but slower image description
  3. Enhanced Caption (Non-Vision): LLM-based caption refinement

Note: Always edit auto-generated captions before submitting.

Examples

Good: "A young Asian man in his 20s with a slim build and short black hair stands shirtless on a beach. He has an uncut penis and is smiling at the camera. The ocean and a cloudy sky are visible in the background."

Avoid: "The male is at the beach. The subject has genitals. There is water and sky."

Good: "A middle-aged Caucasian man with a muscular build and graying hair sits on a wooden chair. He has a cut penis and is looking thoughtfully to the side. The room appears to be a home office with bookshelves in the background."

Avoid: "Naked dude on a chair. Looks kinda old. Has man parts. Seems to be inside somewhere."