How We Review AI Companion Apps

Most AI companion review sites score apps on chat quality and features, then slap a number on it. Privacy gets a bullet point. Safety gets ignored. You’re left wondering: did they actually read the privacy policy, or just test the chatbot for an afternoon?

We built CompanionWise because people choosing an AI companion deserve better than that. You should know exactly how we reach our conclusions, what evidence we rely on, and whether money influences our ratings. (It doesn’t.) This page walks through our full review process, the five areas we evaluate, and the independence rules we follow for every single review we publish.

TL;DR: Every CompanionWise review follows the same process: we evaluate conversation quality, features, pricing, user experience, and safety. Safety ratings come from a separate scoring system with 23 sub-dimensions and published evidence standards. No app can pay for a better score. We publish our full safety methodology and editorial policy so you can verify our standards yourself.

What Do We Evaluate in Every Review?

Every review on CompanionWise covers five areas. We don’t just test whether the AI can hold a conversation. We look at the full picture: how the app works, what it costs, how it treats your data, and whether it’s safe to use.

Here’s what we assess:

  • Conversation quality. How natural does the AI sound? Does it remember what you told it last week? Can it handle emotional topics with care, or does it give generic responses? We test casual chat, emotional support scenarios, and creative interactions across multiple sessions.
  • Features and capabilities. What can you actually do with the app? We look at voice chat, image generation, character customization, memory systems, and any unique features the app offers. We test both free and paid tiers so we can tell you exactly what you get at each price point.
  • Pricing and value. We document every subscription tier, what’s included, and how the pricing compares to similar apps. If an app buries upsells or makes cancellation difficult, we call that out.
  • User experience. Is the app easy to set up? Does it work well on your phone? We evaluate onboarding, interface design, performance, and any friction points that make the app frustrating to use.
  • Safety integration. This is where we differ from other review sites. Every review includes the app’s CompanionWise Safety Rating, which is scored through a completely separate process. Safety isn’t a footnote in our reviews. It’s a core pillar.

CompanionWise reviews assess conversation quality, features, pricing, UX, and safety as five separate dimensions. Each dimension gets its own analysis rather than being collapsed into a single number. That means you can find an app with great chat quality but poor privacy practices, and you’ll see both clearly in our review. Most competitor sites use 8 to 15 scoring criteria focused almost entirely on features. We add an entire safety layer on top, scored independently through our safety rating methodology. For a practical example of how this works, see our best AI companions for students ranking, where safety weighting reshuffles the order from what a features-only list would produce.

How Do We Gather Evidence?

Our v1 evaluation is evidence-based: we analyze official documentation, regulatory records, and user review patterns rather than relying on marketing materials or press screenshots. Here’s what that looks like in practice:

  • Privacy policy review. We read the full privacy policy, not the summary. We check what data gets collected, how long it’s retained, whether it gets shared with third parties, and what happens to your data if you delete your account.
  • Terms of service analysis. We review content ownership clauses, termination policies, arbitration requirements, and any clauses that could surprise users.
  • App store review patterns. We analyze user reviews on Google Play and the Apple App Store, pulling 200 or more reviews per store within a 12-month window. We look for recurring complaints about billing, data handling, or safety issues.
  • Regulatory and incident search. We run mandatory searches for fines, FTC complaints, GDPR violations, government bans, and safety incidents for every app we rate.
  • Official safety documentation. When apps publish dedicated trust or safety pages, we document their crisis response mechanisms, content moderation descriptions, and parental controls.

We hold our published claims to a strict four-tier evidence hierarchy. Any claim using strong language like “unsafe” or “exploitative” requires Tier 1 evidence (official policy quotes, regulatory actions, or direct testing when available) or Tier 2 evidence (reporting from outlets like the New York Times, BBC, or Wired, or findings from peer-reviewed research). Our Muah AI review is one example where regulatory findings and a verified data breach shaped the safety score; our PepHop AI review is another, where a fictional jurisdiction named in the privacy policy raised fundamental enforceability questions. Single social media complaints and unverified screenshots aren’t enough. We don’t publish claims we can’t back up. You can read the full framework in our evidence standards.

How Do Safety Scores Fit Into Reviews?

Safety scores and product reviews are two different things at CompanionWise. We score them through entirely separate processes, using different criteria and different evidence, so that a great chat experience never masks a weak privacy policy.

Our safety ratings use a 23 sub-dimension scoring framework that feeds into six public-facing dimensions:

  1. Data Privacy. What data gets collected, retained, and sold. How clear and complete the privacy policy is.
  2. Emotional Safety. Whether the app uses manipulative tactics to keep users engaged, encourages unhealthy dependency, is transparent about being an AI, and makes accurate claims about therapeutic benefits.
  3. Age Appropriateness. How strong the age verification is, what content guardrails exist for younger users, and whether minors can easily access adult content.
  4. Content Safety. Crisis response and suicide prevention protocols, sexual content guardrails, violence filtering, and whether the AI respects user boundaries.
  5. Transparency. Terms of service fairness, safety incident reporting, monetization ethics, regulatory compliance track record, and responsiveness to user feedback.
  6. User Control. How easy it is to delete your account, export your data, cancel your subscription, and remove conversation history.

Each dimension is scored and displayed on a 0-to-100 scale. The overall Safety Rating is the weighted average across all six (Content Safety, Emotional Safety, and Data Privacy each carry 20%; Age Appropriateness and Transparency carry 15%; User Control carries 10%), displayed on a visual scorecard alongside the last-reviewed date. When a score changes, we show the previous value and explain why it moved.
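To make the weighted average concrete, here is a minimal sketch of the calculation in Python. The weights come straight from the breakdown above; the dimension scores are invented for illustration and don’t correspond to any real app:

```python
# Weights for the six public safety dimensions, as described above.
# They sum to 1.0 (20% + 20% + 20% + 15% + 15% + 10%).
WEIGHTS = {
    "data_privacy": 0.20,
    "emotional_safety": 0.20,
    "content_safety": 0.20,
    "age_appropriateness": 0.15,
    "transparency": 0.15,
    "user_control": 0.10,
}

def overall_safety_rating(scores: dict[str, float]) -> float:
    """Weighted average of the six dimension scores (each on a 0-100 scale)."""
    return round(sum(scores[dim] * w for dim, w in WEIGHTS.items()), 1)

# Hypothetical app: strong privacy and user control, weak age gating.
example = {
    "data_privacy": 82,
    "emotional_safety": 70,
    "content_safety": 75,
    "age_appropriateness": 40,
    "transparency": 60,
    "user_control": 90,
}
print(overall_safety_rating(example))  # 69.4
```

Because the heavier 20% weights sit on Content Safety, Emotional Safety, and Data Privacy, weaknesses in those dimensions pull the overall rating down faster than the same weakness in User Control would.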

Why separate safety from product quality? Because an app can have excellent conversation features and terrible privacy practices, and blending those into one score hides the risk. OurDream AI’s D-rated safety page and PolyBuzz’s F-rated safety page show how we surface these risks independently. Safety ratings use 23 sub-dimensions across six public categories, scored independently of product reviews, and every safety score is finalized before any commercial discussion with that app’s developer, so sponsorship conversations never influence safety outcomes. Our full scoring methodology, including how each sub-dimension is weighted and what evidence drives each score, is published at How We Rate.

What Are Our Independence Standards?

No app can pay for a better score. That’s the short version. Here are the rules we enforce:

  • Safety ratings are never influenced by sponsorship. A safety score is finalized before any commercial discussion with that app’s developer.
  • Sponsored placements don’t appear on safety pages. You’ll never see a “Sponsored” badge next to a safety rating or on a Safety Index page.
  • Low-scoring apps can’t buy visibility. Apps scoring below 38 out of 100 on our Safety Index are ineligible for any sponsored placement on CompanionWise. Apps below 50 can’t receive “Editor’s Pick” or “Top Pick” badges, regardless of any commercial arrangement.
  • Sponsored content is always labeled. If a placement is sponsored, you’ll see a clear “Sponsored” badge on the page and a disclosure statement inline.
  • Affiliate links don’t affect ratings. Whether or not an app has an affiliate program has zero bearing on its review score or safety rating.
  • Sponsors pay for audience access, not editorial outcomes. Any sponsor who attempts to influence a rating gets removed from our sponsorship program immediately.

A site that publishes safety ratings can’t also sell ad space to apps it considers unsafe, which is why apps scoring below 38 out of 100 on the CompanionWise Safety Index cannot purchase sponsored placements anywhere on this site. These aren’t guidelines we hope to follow. They’re hard rules written into our editorial independence policy.
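Under the thresholds stated above (38 for sponsored placement, 50 for badges), the eligibility logic amounts to two simple cutoffs. The sketch below is illustrative only; the function names are invented and don’t come from any published CompanionWise code:

```python
SPONSORSHIP_FLOOR = 38  # below this: no sponsored placement anywhere on the site
BADGE_FLOOR = 50        # below this: no "Editor's Pick" or "Top Pick" badges

def sponsorship_eligible(safety_score: int) -> bool:
    """Apps scoring below 38/100 on the Safety Index can't buy placements."""
    return safety_score >= SPONSORSHIP_FLOOR

def badge_eligible(safety_score: int) -> bool:
    """Badges additionally require a Safety Index score of at least 50."""
    return safety_score >= BADGE_FLOOR

print(sponsorship_eligible(37), badge_eligible(37))  # False False
print(sponsorship_eligible(42), badge_eligible(42))  # True False
print(sponsorship_eligible(50), badge_eligible(50))  # True True
```

The point of hard numeric floors is that they leave no room for case-by-case negotiation: an app either clears the cutoff or it doesn’t.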

How Do We Keep Reviews Current?

AI companion apps change fast. New features ship, pricing changes, privacy policies get updated, and sometimes regulators step in. A review from six months ago might not reflect what you’d experience today. So we don’t just publish and move on.

We follow a tiered refresh schedule:

  • Tier 1 (top apps by traffic). Checked monthly. Right now that includes Replika, Character.ai, Nomi, Kindroid, Candy AI, and Talkie AI.
  • Tier 2 (apps 6 through 15). Reviewed quarterly, during the first week of each quarter.
  • Tier 3 (long-tail and smaller apps). Reviewed semi-annually, in January and July. Recent additions include Janitor AI, CrushOn AI, Mello AI, and Momo Self-Care.
  • Breaking updates. Any app, any tier. If there’s a safety incident, regulatory action (such as the Replika ERP controversy), major pricing change, or app shutdown, we update within 48 hours.

Every review and safety rating page displays a “Last Reviewed” date above the fold. You should always be able to see when we last checked the information you’re reading. If an app’s safety score changes, we show the previous value and explain what triggered the update.

What Makes Our Approach Different?

We studied how other AI companion review sites work before building ours. Most use 8 to 15 scoring criteria weighted heavily toward features: chat quality, image generation, voice quality, customization options. Privacy might get 10% of the total weight. Safety, if it appears at all, is a single line item buried at the bottom of a score chart.

CompanionWise does three things differently:

  1. Safety is a separate system, not a line item. Our Safety Index uses 23 sub-dimensions. It’s scored independently. It has its own published methodology. It produces a standalone safety rating page for every app we review.
  2. We publish our evidence standards. Our evidence hierarchy spells out exactly what counts as publishable evidence and what doesn’t. No other AI companion review site we’ve found does this.
  3. Our editorial rules have teeth. We don’t just say “affiliate links don’t affect scores.” We publish hard thresholds: apps below 38 safety can’t buy placements. Apps below 50 can’t get badges. These rules are in our editorial policy, not buried in a footnote.

Is this more work than running a feature-comparison site? Absolutely. But you’re trusting our recommendations for apps that will read your private conversations and charge your credit card. The review process behind those recommendations should hold up to the same scrutiny.

Frequently Asked Questions

Do you test every app yourself?

Our current methodology (v1) is evidence-based rather than behavioral. For every app, we read the full privacy policy and terms of service, analyze app store review patterns, search for regulatory actions and fines, and review official safety documentation. We score each app’s 23 sub-dimensions based on this documented evidence using our standardized AI-assisted methodology, with every score reviewed and approved by a human editor. Direct behavioral testing (creating accounts, stress-testing crisis response, attempting to bypass filters) is planned for future methodology versions. Our evidence standards describe the four tiers of evidence we accept before publishing any claim.

Can an app pay for a higher score?

No. Safety ratings are finalized before any commercial discussion with an app’s developer. Sponsored placements are clearly labeled and never appear on safety rating pages. Apps with a Safety Index score below 38 out of 100 can’t purchase any sponsored placement on CompanionWise. Full details are in our editorial independence policy.

How often are reviews updated?

Our top apps by traffic are checked monthly. The next tier of apps is reviewed quarterly, and smaller apps are reviewed semi-annually. Breaking events like safety incidents, regulatory actions, or major pricing changes trigger updates within 48 hours, regardless of tier. Every page shows a “Last Reviewed” date so you know when we last verified the information.

Where can I see your full safety methodology?

Our complete safety scoring framework, including all six public dimensions and how sub-dimensions are weighted, is published on our How We Rate page. The evidence gathering methodology is covered in our Evidence Standards. Our editorial rules are outlined in our Editorial Policy.


Every review on CompanionWise follows this same process. If you’re going to trust a website’s recommendation for an app that handles your private conversations, you should be able to see exactly how that recommendation was made.

Want to see the process in action? Read one of our reviews, or explore our full ranking in Best AI Companion Apps 2026. You can also dig into our safety rating methodology to understand how we score the six safety dimensions. If you have questions about our process that aren’t answered here, get in touch. Our CrushOn AI review and Sakura FM review cover additional apps in the index.