Troubleshooting

This guide helps resolve common issues when using U-Probe.

Installation Issues

Command not found: uprobe

Problem: After installation, the uprobe command is not recognized.

Solutions:

Check if installed correctly:

bash

   pip list | grep uprobe
   python -c "import uprobe; print(uprobe.__version__)"

Try using Python module syntax:

bash

   python -m uprobe.core.cli --help

Check PATH (for --user installs):

bash

   # Add to ~/.bashrc or ~/.zshrc
   export PATH="$HOME/.local/bin:$PATH"

Reinstall in a virtual environment:

bash

   python -m venv uprobe_env
   source uprobe_env/bin/activate
   pip install uprobe
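To tell a PATH problem apart from a missing installation, it can help to ask Python what the shell would resolve. The sketch below uses only the standard library; `dir_on_path` and `locate_command` are illustrative helper names, not part of U-Probe.

```python
import os
import shutil


def dir_on_path(directory, path_env=None):
    """Return True if `directory` is one of the entries in PATH."""
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [
        os.path.normpath(os.path.expanduser(entry))
        for entry in path_env.split(os.pathsep)
        if entry
    ]
    return target in entries


def locate_command(name="uprobe"):
    """Return the full path of `name` if the shell could find it, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    print("uprobe resolves to:", locate_command())
    print("~/.local/bin on PATH:", dir_on_path("~/.local/bin"))
```

If `locate_command()` returns `None` while `pip show uprobe` succeeds, the package is installed but its script directory is probably not on PATH.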

ImportError: No module named 'uprobe'

Problem: Python cannot find the uprobe module.

Solutions:

Verify installation:

bash

   pip show uprobe

Check Python environment:

bash

   which python
   which pip
   # Ensure both point to the same environment

Reinstall:

bash

   pip uninstall uprobe
   pip install uprobe
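The environment check above can also be run from inside Python, which avoids guessing which interpreter `pip` installed into. This is a minimal sketch using the standard library; `describe_environment` is a hypothetical helper name.

```python
import importlib.util
import sys


def describe_environment(module_name="uprobe"):
    """Report which interpreter is running and whether it can find the module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        # In a virtual environment, sys.prefix differs from sys.base_prefix
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Run it with the same `python` that `which python` reports. If `module_found` is `False`, install with `python -m pip install uprobe` so that pip targets that exact interpreter.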

Missing dependencies errors

Problem: Errors about missing packages such as pandas or click.

Solutions:

Install all requirements:

bash

   # Run from a source checkout that contains requirements.txt
   pip install -r requirements.txt

Update pip and try again:

bash

   pip install --upgrade pip
   pip install uprobe

For development installs:

bash

   pip install -e ".[dev]"

Configuration Issues

FileNotFoundError: [Errno 2] No such file or directory

Problem: U-Probe cannot find specified files.

Solutions:

Use absolute paths:

yaml

   # Relative paths resolve against the current working directory:
   # fasta: "genome.fa"
   # Prefer an absolute path:
   fasta: "/full/path/to/genome.fa"

Check file permissions:

bash

   ls -la /path/to/genome.fa
   # Ensure files are readable

Verify file existence:

bash

   file /path/to/genome.fa
   head -n 5 /path/to/genome.fa
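The three checks above can be combined into one helper that reports every problem with a configured path at once. This is a sketch under the assumption that each path is read from your own configuration; `check_input_file` is an illustrative name and not a U-Probe function.

```python
import os


def check_input_file(path):
    """Return a list of problems found with `path` (an empty list means OK)."""
    problems = []
    if not os.path.isabs(path):
        problems.append(
            "path is relative; it resolves against the current working directory"
        )
    if not os.path.exists(path):
        problems.append("file does not exist")
    elif not os.path.isfile(path):
        problems.append("path is not a regular file")
    elif not os.access(path, os.R_OK):
        problems.append("file is not readable by the current user")
    elif os.path.getsize(path) == 0:
        problems.append("file is empty")
    return problems


if __name__ == "__main__":
    for candidate in ["/full/path/to/genome.fa", "genome.fa"]:
        print(candidate, "->", check_input_file(candidate) or "OK")
```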

Target validation failed

Problem: Error message "Invalid targets found" or no targets pass validation.

Solutions:

Check gene names in GTF:

bash

   # Search for your gene in GTF
   grep -i "GAPDH" /path/to/annotation.gtf
   
   # Check available gene names
   awk '$3=="gene"' /path/to/annotation.gtf | \
   grep -o 'gene_name "[^"]*"' | sort | uniq | head -20

Try different gene identifiers:

yaml

   targets:
     - "GAPDH"           # Gene symbol
     - "ENSG00000111640" # Ensembl ID
     - "2597"            # Entrez ID

Use the --continue-invalid flag for testing:

bash

   uprobe validate-targets -p protocol.yaml -g genomes.yaml --continue-invalid

Check GTF format:

bash

   # GTF should have these columns:
   # seqname source feature start end score strand frame attribute
   head -n 5 /path/to/annotation.gtf
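When `grep` output is hard to read, the same GTF checks can be scripted. The sketch below validates the nine-column layout listed above and collects `gene_name` values from `gene` rows. It assumes attributes are written as `key "value";` pairs, which is the usual GTF convention; the function names are illustrative.

```python
import re

GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attribute"]

# Matches attribute pairs of the form: key "value"
ATTRIBUTE_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_gtf_line(line):
    """Parse one GTF line into a dict, or return None for comments and blanks.

    Raises ValueError if the line does not have nine tab-separated columns.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != len(GTF_COLUMNS):
        raise ValueError(
            f"expected 9 tab-separated columns, found {len(fields)}"
        )
    record = dict(zip(GTF_COLUMNS, fields))
    record["attributes"] = dict(ATTRIBUTE_RE.findall(record["attribute"]))
    return record


def gene_names(lines):
    """Collect the distinct gene_name values from rows whose feature is 'gene'."""
    names = set()
    for line in lines:
        record = parse_gtf_line(line)
        if record and record["feature"] == "gene":
            name = record["attributes"].get("gene_name")
            if name:
                names.add(name)
    return sorted(names)
```

For example, `gene_names(open("/path/to/annotation.gtf"))` lists the symbols that a target name can be compared against. A `ValueError` usually means the file is space-separated rather than tab-separated.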

Invalid YAML syntax

Problem: YAML parsing errors.

Solutions:

Check indentation (use spaces, not tabs):

yaml

   # Correct
   probes:
     main_probe:          # 2 spaces
       template: "{seq}"  # 4 spaces
   
   # Wrong (tabs or inconsistent spacing)
   probes:
   	main_probe:        # tab character
      template: "{seq}"   # 3 spaces

Validate YAML syntax:

bash

   python -c "import yaml; yaml.safe_load(open('protocol.yaml'))"

Quote strings with special characters:

yaml

   # Quote expressions and conditions
   expr: "rc(target_region[0:20])"
   condition: "gc_content >= 0.4 & gc_content <= 0.6"
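Tab characters in indentation are hard to spot by eye, and parser messages about them can be terse. The indentation rule above can be checked mechanically with a few lines of standard-library Python; `find_tab_indentation` is an illustrative helper name.

```python
def find_tab_indentation(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    offenders = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            offenders.append(number)
    return offenders


if __name__ == "__main__":
    example = 'probes:\n\tmain_probe:\n    template: "{seq}"\n'
    print(find_tab_indentation(example))
```

To check a real file, pass its contents: `find_tab_indentation(open("protocol.yaml").read())`.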

Runtime Issues

No target sequences generated

Problem: The generate-targets step produces an empty result.

Solutions:

Check extraction parameters:

yaml

   extracts:
     target_region:
       source: "exon"  # Try "gene" if exons are too short
       length: 50      # Reduce if regions are smaller
       overlap: 10     # Reduce overlap

Verify targets exist:

bash

   uprobe --verbose validate-targets -p protocol.yaml -g genomes.yaml

Check for gene annotation issues:

bash

   # Look for your gene in GTF
   grep "GAPDH" /path/to/annotation.gtf | head -5

No probes constructed

Problem: The construct-probes step fails or produces no output.

Solutions:

Check probe expressions:

yaml

   probes:
     test_probe:
       template: "{simple_part}"
       parts:
         simple_part:
           length: 20
           expr: "target_region[0:20]"  # Simple expression

Verify encoding mappings:

yaml

   # Ensure all target genes have encoding entries
   encoding:
     GAPDH:  # Must match target name exactly
       BC1: "ACGTACGTACGT"

Test with minimal probe:

yaml

   probes:
     minimal:
       expr: "target_region[0:25]"

All probes filtered out

Problem: Post-processing removes all probes.

Solutions:

Use the --raw flag to see unfiltered probes:

bash

   uprobe run -p protocol.yaml -g genomes.yaml --raw

Relax filtering conditions:

yaml

   post_process:
     filters:
       gc_content:
         condition: "gc_content >= 0.2 & gc_content <= 0.8"  # Very relaxed

Check attribute calculations:

yaml

   # Remove problematic attributes temporarily
   attributes:
     basic_gc:
       target: main_probe
       type: gc_content
     # Comment out complex attributes:
     # off_targets: ...

Examine raw results:

python

   import pandas as pd
   df = pd.read_csv('results/experiment_raw.csv')
   print(df.describe())  # Check attribute distributions
   print(df[df['gc_content'].isna()])  # Find failed calculations
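If the values in a GC column look suspicious, an independent calculation on a few probe sequences is a quick sanity check. The sketch below computes the GC fraction directly and applies the same 0.4 to 0.6 window used in the filter examples. It is not U-Probe's implementation, which may handle ambiguous bases differently.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in `sequence` (0.0 for an empty string)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_gc_window(sequence, low=0.4, high=0.6):
    """True if the GC fraction lies within the inclusive [low, high] window."""
    return low <= gc_fraction(sequence) <= high


if __name__ == "__main__":
    for probe in ["ACGTACGTACGT", "AAAAAAAATTTT", "GGGGCCCCGGCC"]:
        print(probe, gc_fraction(probe), passes_gc_window(probe))
```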

Performance Issues

Slow execution

Problem: U-Probe runs very slowly.

Solutions:

Increase thread count:

bash

   uprobe run -p protocol.yaml -g genomes.yaml -t 16

Use faster extraction:

yaml

   extracts:
     target_region:
       source: "exon"  # Faster than "gene"
       length: 100     # Shorter regions

Reduce expensive attributes:

yaml

   attributes:
     # Keep fast attributes
     gc_content:
       target: main_probe
       type: gc_content
     # Remove slow ones temporarily:
     # fold_score: ...
     # kmer_count: ...

Process in batches:

bash

   # Split large target lists
   uprobe run -p small_batch.yaml -g genomes.yaml
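Splitting a long target list by hand is error-prone. A small helper can produce the batches, each of which then goes into the `targets` list of its own protocol file. `chunk_targets` is an illustrative name, and the batch size is an assumption to tune for your data.

```python
def chunk_targets(targets, batch_size=10):
    """Split target names into consecutive batches of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        targets[start:start + batch_size]
        for start in range(0, len(targets), batch_size)
    ]


if __name__ == "__main__":
    genes = ["GAPDH", "ACTB", "TP53", "MYC", "EGFR"]
    for index, batch in enumerate(chunk_targets(genes, batch_size=2), start=1):
        print(f"batch {index}: {batch}")
```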

Memory issues

Problem: Out-of-memory errors occur, or the system becomes unresponsive.

Solutions:

Process smaller batches:

yaml

   targets:
     - "GAPDH"
     - "ACTB"
     # Process 5-10 genes at a time for large genomes

Reduce sequence length:

yaml

   extracts:
     target_region:
       length: 80   # Shorter sequences use less memory
       overlap: 15

Skip memory-intensive attributes:

yaml

   # Avoid these for large datasets:
   # - n_mapped_genes with blast
   # - kmer_count
   # - complex fold_score calculations

Index building fails

Problem: Genome index building fails or crashes.

Solutions:

Check available disk space:

bash

   df -h /path/to/genome/directory

Verify genome file integrity:

bash

   file /path/to/genome.fa
   head -n 10 /path/to/genome.fa
   tail -n 10 /path/to/genome.fa

Build indices manually:

bash

   # Bowtie2
   bowtie2-build /path/to/genome.fa /path/to/indices/genome
   
   # BLAST
   makeblastdb -in /path/to/genome.fa -dbtype nucl -out /path/to/indices/genome

Use pre-built indices:

yaml

   # Point to existing indices
   human_hg38:
     fasta: "/data/hg38.fa"
     gtf: "/data/hg38.gtf"
     out: "/data/existing_indices"  # Pre-built indices location
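Before rebuilding an index, the disk-space and file-integrity checks from the steps above can be combined into one pre-flight script. This sketch assumes a plain-text (uncompressed) FASTA and treats anything outside A, C, G, T, N as unexpected, so IUPAC ambiguity codes will be reported. The helper names are illustrative.

```python
import shutil


def free_space_gib(path="."):
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)


def summarize_fasta(path):
    """Scan a FASTA file and return basic integrity statistics."""
    allowed = set("ACGTNacgtn")
    records = 0
    bases = 0
    unexpected = set()
    first_line_ok = None
    with open(path, "r") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if first_line_ok is None:
                # A valid FASTA file begins with a '>' header line
                first_line_ok = line.startswith(">")
            if line.startswith(">"):
                records += 1
            else:
                bases += len(line)
                unexpected.update(set(line) - allowed)
    return {
        "starts_with_header": bool(first_line_ok),
        "records": records,
        "bases": bases,
        "unexpected_characters": sorted(unexpected),
    }
```

A file that was truncated during download often shows far fewer records or bases than expected. How much free space an index needs depends on the genome and the aligner, so compare `free_space_gib()` against your own estimate.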

Attribute Calculation Issues

Melting temperature calculation fails

Problem: Tm calculation produces NaN values or errors.

Solutions:

Check sequence validity:

python

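   # Sequences should contain only A, C, G, T (uppercase)
   import re

   def check_sequence(seq):
       # fullmatch with + rejects empty strings and trailing newlines,
       # which '^[ATCG]*$' would accept
       return bool(re.fullmatch(r'[ACGT]+', seq))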

Handle short sequences:

yaml

   # Ensure minimum sequence length
   probes:
     main_probe:
       parts:
         binding:
           length: 15  # Minimum for reliable Tm calculation

Off-target calculation fails

Problem: Alignment-based attributes fail.

Solutions:

Verify indices exist:

bash

   ls -la /path/to/indices/
   # Should contain .bt2 files for bowtie2

Test aligner manually:

bash

   # Test bowtie2 with a single read given on the command line (-c)
   bowtie2 -x /path/to/indices/genome -c -U ATCGATCGATCGATCG

Use alternative aligner:

yaml

   attributes:
     off_targets:
       target: main_probe
       type: n_mapped_genes
       aligner: blast  # Try blast if bowtie2 fails
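The index check in the first step can be scripted so that a partially built index is caught early. Bowtie2 writes six files per index (`.1.bt2` to `.4.bt2`, `.rev.1.bt2`, `.rev.2.bt2`), or the same names ending in `.bt2l` for large genomes. The prefix used below is a placeholder, and `missing_bowtie2_files` is an illustrative helper name.

```python
import os

BT2_SUFFIXES = [".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2",
                ".rev.1.bt2", ".rev.2.bt2"]


def missing_bowtie2_files(index_prefix):
    """Return the expected Bowtie2 index files that are absent for a prefix.

    Large indices use the .bt2l extension, so either form is accepted.
    """
    missing = []
    for suffix in BT2_SUFFIXES:
        small = index_prefix + suffix
        large = small + "l"  # e.g. genome.1.bt2l
        if not (os.path.isfile(small) or os.path.isfile(large)):
            missing.append(os.path.basename(small))
    return missing


if __name__ == "__main__":
    print(missing_bowtie2_files("/path/to/indices/genome") or "index looks complete")
```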

K-mer counting fails

Problem: kmer_count attributes produce errors.

Solutions:

Check Jellyfish database:

bash

   jellyfish info genome.jf

Build Jellyfish database:

bash

   jellyfish count -m 15 -s 1000000000 -t 8 -o genome.jf genome.fa

Use alternative complexity measures:

yaml

   # Instead of kmer_count, use:
   attributes:
     sequence_complexity:
       target: main_probe
       type: complexity_score

Data Format Issues

Unexpected output format

Problem: Output CSV has unexpected columns or values.

Solutions:

Check probe names match:

yaml

   # Probe names become column names
   probes:
     my_probe:  # Creates column 'my_probe'
       template: "{seq}"

Verify attribute names:

yaml

   attributes:
     probe_gc:     # Creates column 'probe_gc'
       target: my_probe
       type: gc_content

Examine raw output:

bash

   uprobe run -p protocol.yaml -g genomes.yaml --raw
   # Check _raw.csv file for all calculated values

Missing sequences in output

Problem: Some expected probes are missing from results.

Solutions:

Check filtering criteria:

yaml

   # Very permissive filters for debugging
   post_process:
     filters:
       anything_goes:
         condition: "True"  # Passes everything

Look for errors in logs:

bash

   uprobe --verbose run -p protocol.yaml -g genomes.yaml 2>&1 | tee log.txt

Check intermediate files:

bash

   ls -la results/
   wc -l results/*.csv  # Count lines in each file
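`wc -l` counts physical lines, including the header. To compare the number of data rows between intermediate files, a CSV-aware count is more reliable. This sketch assumes each file's first row is a header and that outputs live in a `results` directory, as in the commands above.

```python
import csv
import glob
import os


def csv_row_counts(directory="results"):
    """Map each CSV file name in `directory` to its number of data rows.

    The first row of every file is assumed to be a header and is not counted.
    """
    counts = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        with open(path, newline="") as handle:
            rows = sum(1 for _ in csv.reader(handle))
        counts[os.path.basename(path)] = max(rows - 1, 0)
    return counts


if __name__ == "__main__":
    for name, rows in csv_row_counts().items():
        print(f"{rows:8d}  {name}")
```

A sharp drop in rows between the raw file and the final file points at the filters rather than at probe construction.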

Getting Help

Check Logs

Always run with verbose output for troubleshooting:

bash

uprobe --verbose run -p protocol.yaml -g genomes.yaml 2>&1 | tee uprobe.log

Minimal Test Case

Create a minimal test to isolate issues:

yaml

# minimal_test.yaml
name: "minimal_test"
genome: "human_hg38"
targets: ["GAPDH"]  # Just one target

extracts:
  target_region:
    source: "exon"
    length: 50
    overlap: 10

probes:
  simple:
    expr: "target_region[0:20]"

# No attributes or filters initially

Report Issues

When reporting issues, include:

U-Probe version: uprobe version
Full error message and traceback
Configuration files (anonymized)
System information: OS, Python version
Steps to reproduce
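The version and system details in the list above can be gathered in one step and pasted into a report. This is a sketch using the standard library only. It assumes the installed distribution is named `uprobe`, as in the `pip install uprobe` command; `collect_report_info` is an illustrative helper.

```python
import platform
import sys
from importlib import metadata


def collect_report_info(package="uprobe"):
    """Gather the environment details that a bug report usually needs."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = "not installed"
    return {
        "package": package,
        "package_version": version,
        "python": sys.version.split()[0],
        "os": platform.platform(),
    }


if __name__ == "__main__":
    for key, value in collect_report_info().items():
        print(f"{key}: {value}")
```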

Where to Get Help

Documentation: Check this documentation first
GitHub Issues: Report bugs
GitHub Discussions: Ask questions
Examples: Review working examples in the repository

Common Error Messages

Error Message	Solution
"Genome 'X' not found"	Check genome name matches genomes.yaml key
"No targets specified"	Add targets list to protocol.yaml
"Invalid expression: X"	Check probe expression syntax
"Attribute calculation failed"	Verify required files and indices exist
"No data to concatenate"	Check that previous steps generated output
"YAML parsing error"	Check indentation and syntax
"Permission denied"	Check file permissions and disk space
"Index not found"	Run build-index command first

Prevention Tips

Start simple: Begin with basic configurations and add complexity gradually
Validate early: Use validate-targets before full runs
Test with subsets: Use small target lists for initial testing
Use version control: Track configuration changes
Document decisions: Comment your configuration files
Regular backups: Keep backups of working configurations

Next Steps

If you're still having issues:

Review the examples for working configurations
Check the configuration guide for detailed option descriptions
Ask for help on GitHub Discussions
