🔧 Django Management Commands for Contact & Domain Intelligence (with Gensim)

This post walks through a suite of Django management commands that clean up and enrich the Contact and Domain models of a real estate platform running Django 1.8 and Python 2.7. Everything is designed to work with the legacy stack while still using NLP techniques such as text summarization.

🛠️ Overview of Management Commands

1. update_contact_offer_counts

Purpose: Updates the count field of each Contact with the number of related Offer objects.

python manage.py update_contact_offer_counts

2. update_domain_contact_counts

Purpose: Updates the contact_count field in each Domain by counting how many Contact objects are assigned to it.

python manage.py update_domain_contact_counts

3. update_domain_ad_counts

Purpose: Sums up all Contact.count values for contacts linked to a Domain, and saves that total in the Domain.ad_count field.

python manage.py update_domain_ad_counts
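The three counting commands above boil down to the same roll-up, so here is a rough, framework-free sketch of it. The input shapes (a list of contact ids, a dict of contact id to domain id) and the function name are assumptions made for illustration; the real commands would read and write these values through the ORM.

```python
from collections import Counter

def roll_up_counts(offer_contact_ids, contact_domains):
    """Return (offers per contact, contacts per domain, ads per domain).

    offer_contact_ids: one contact id per Offer row (hypothetical input shape).
    contact_domains:   dict of contact id -> domain id, or None if unassigned.
    """
    offer_counts = Counter(offer_contact_ids)   # would feed Contact.count
    contact_count = Counter()                   # would feed Domain.contact_count
    ad_count = Counter()                        # would feed Domain.ad_count
    for contact_id, domain_id in contact_domains.items():
        if domain_id is None:
            continue  # contacts without a Domain add to no domain total
        contact_count[domain_id] += 1
        ad_count[domain_id] += offer_counts[contact_id]
    return offer_counts, contact_count, ad_count

# Made-up sample data: six offers spread over three contacts.
offers = [1, 1, 2, 3, 3, 3]
domains = {1: "acme.example", 2: "acme.example", 3: None, 4: "solo.example"}
offer_counts, contact_count, ad_count = roll_up_counts(offers, domains)
```

Because ad_count is summed from Contact.count, it makes sense to run update_contact_offer_counts before update_domain_ad_counts so the totals are built from fresh numbers.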

4. show_contacts_with_multiple_offers_and_no_domain

Purpose: Lists all Contact objects that:

  • Have more than one offer (count > 1)
  • Have a non-empty website
  • Do not yet have a Domain assigned

python manage.py show_contacts_with_multiple_offers_and_no_domain
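The three conditions can be read as a single predicate. Below is a small sketch that uses plain dicts as stand-ins for Contact rows; the key names (count, website, domain_id) are assumptions based on the fields mentioned in this post, not the actual model definition.

```python
def needs_domain(contact):
    """True if the contact has several offers, a website, and no Domain yet."""
    website = (contact.get("website") or "").strip()
    has_many_offers = (contact.get("count") or 0) > 1
    return has_many_offers and bool(website) and contact.get("domain_id") is None

# Made-up rows for illustration only.
contacts = [
    {"name": "Acme Realty", "count": 5, "website": "http://acme.example", "domain_id": None},
    {"name": "No Website", "count": 4, "website": "", "domain_id": None},
    {"name": "Single Offer", "count": 1, "website": "http://one.example", "domain_id": None},
    {"name": "Already Linked", "count": 9, "website": "http://ok.example", "domain_id": 7},
]
pending = [c["name"] for c in contacts if needs_domain(c)]
```

In this sample only "Acme Realty" passes all three checks.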

5. assign_domains_to_contacts

Purpose: For every Domain, finds Contact objects whose website URL contains the domain’s URL, and assigns that Domain if not already assigned.

python manage.py assign_domains_to_contacts
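One thing to watch with "contains" matching: a plain substring test will also match look-alike hosts. The sketch below is not the command's actual logic; it shows a stricter alternative that compares hostnames instead. The helper names are hypothetical, and the import fallback is there so it runs on Python 2.7 as well as Python 3.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7

def hostname(url):
    """Return the lowercased host of url without a leading 'www.'."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # urlparse only finds the host when a scheme is present
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def matches_domain(website, domain_url):
    """True if website is on the domain itself or on one of its subdomains."""
    site, domain = hostname(website), hostname(domain_url)
    if not site or not domain:
        return False
    return site == domain or site.endswith("." + domain)

# Substring matching accepts a look-alike host; hostname matching does not.
substring_hit = "example.com" in "http://notexample.com"
hostname_hit = matches_domain("http://notexample.com", "example.com")
```

Whether subdomains such as shop.example.com should count as a match is a design choice; this sketch says yes.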

6. copy_contact_logos_to_domains

Purpose: For each Domain that has no logo, finds a related Contact that does, and copies the logo.

python manage.py copy_contact_logos_to_domains

7. generate_summaries_with_gensim

Purpose: Generates a short summary from each Domain.plain_rewrite using Gensim’s summarize() function, and stores it in the description field.

python manage.py generate_summaries_with_gensim
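Summarizers can fail or come back empty on very short texts, so it helps to wrap the call. Here is a minimal sketch, assuming the summarizer is passed in as a plain callable; build_description, max_chars, and the stand-in first_sentence function are all hypothetical names, not part of Gensim or of the command itself.

```python
def build_description(text, summarizer, max_chars=300):
    """Summarize text, falling back to a trimmed excerpt when that fails."""
    text = (text or "").strip()
    if not text:
        return ""
    try:
        summary = summarizer(text) or ""
    except ValueError:
        # To my knowledge, Gensim 3.x summarize() raises ValueError
        # when the input has only a single sentence.
        summary = ""
    summary = " ".join(summary.split())  # flatten newlines into one line
    if not summary:
        summary = " ".join(text.split())  # fall back to the source text
    if len(summary) <= max_chars:
        return summary
    # Cut at the last space before the limit so no word is split in half.
    return summary[:max_chars].rsplit(" ", 1)[0] + "..."

def first_sentence(text):
    """Stand-in summarizer used here so the example runs without Gensim."""
    return text.split(". ")[0].rstrip(".") + "."

description = build_description(
    "Family-run agency in Lyon. We list flats and houses. Call us.",
    first_sentence,
)
```

With Gensim 3.x installed, the real function would be plugged in as build_description(domain.plain_rewrite, summarize) after from gensim.summarization import summarize.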

8. generate_rewrite_and_summary

Purpose: If plain_rewrite is empty, first strips the HTML tags from html_rewrite to produce plain text. Then generates a summary with Gensim and saves it in description.

python manage.py generate_rewrite_and_summary
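The HTML-stripping step can be done with the standard library alone. This is a sketch under assumptions, not the command's actual code: the class and function names are made up, and the list of block-level tags is a rough guess at what needs a space between neighbors.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7

class TextExtractor(HTMLParser):
    SKIP = ("script", "style")  # contents of these tags are dropped
    BLOCK = ("p", "div", "br", "li", "ul", "ol", "tr", "td",
             "h1", "h2", "h3", "h4", "h5", "h6")

    def __init__(self):
        HTMLParser.__init__(self)  # old-style class on Python 2.7, so no super()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append(" ")  # keep neighboring blocks from fusing

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.BLOCK:
            self.parts.append(" ")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

def html_to_text(html):
    """Return the visible text of html with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html or "")
    parser.close()
    return " ".join("".join(parser.parts).split())

sample = "<div><h1>Acme Realty</h1><p>Flats in <b>Lyon</b>.</p><script>track()</script></div>"
plain = html_to_text(sample)
```

On Python 2.7 the parser reports entities such as &amp;amp; through separate callbacks, so they are silently dropped here; handling them would need handle_entityref and handle_charref methods.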

🧠 Bonus: What Can Gensim Do With Text?

Gensim is an NLP toolkit focused on semantic modeling, topic discovery, and similarity analysis. It is particularly useful when working with large sets of unstructured text such as contact descriptions, real estate listings, or scraped HTML.

| Feature            | Tool/Method        | Use Case                                   |
|--------------------|--------------------|--------------------------------------------|
| Summarization      | summarize()        | Auto-snippets, TL;DRs, meta descriptions   |
| Keyword Extraction | keywords()         | Auto-tagging, search filtering, highlights |
| Topic Modeling     | LdaModel, LsiModel | Discover themes in ads or descriptions     |
| Similarity Search  | MatrixSimilarity   | Detect duplicates, recommend similar items |
| Word Similarity    | Word2Vec, FastText | Semantic search, user intent detection     |
| Document Embedding | Doc2Vec            | Content recommendation, ML clustering      |
| TF-IDF Modeling    | TfidfModel         | Identify unique or weighted keywords       |

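Pro tip: Even in legacy Python 2.7 setups, Gensim 3.x remains a practical choice for NLP processing without heavy ML infrastructure. Note that summarize() and keywords() live in the gensim.summarization module, which was removed in Gensim 4.0, so pin the dependency below 4.0 if you rely on them.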

🚀 Ready to Expand

With these tools in place, you now have:

  • Clean, structured data (count, ad_count, description)
  • Enriched content from HTML
  • NLP summaries, with keyword extraction and auto-tagging as natural next steps

This lays the foundation for smart features like:

  • Related listings
  • Contact deduplication
  • AI-assisted content suggestions
  • Real-time domain health dashboards

Let me know if you’d like to expand this setup with TF-IDF, clustering, auto-tagging, or multi-language summaries next!
