Friday, 29 May 2026

Mastering Identity Resolution in Sitecore CDP: Anonymous to Known Visitors

 


There is a moment in almost every CDP implementation where the team sits around and asks the same question: how do we know this anonymous visitor is the same person who logged in an hour later from their phone? It sounds simple. In practice, it is one of the more complex problems you will face in a real Sitecore CDP project.

Identity resolution is the process of stitching together all the signals a visitor leaves across sessions, devices, and channels — and associating those signals with a single, unified profile. Get this right, and your personalization becomes sharp and meaningful. Get it wrong, and you end up with fragmented data, duplicate profiles, and personalization logic that fires at the wrong people at the wrong time.

Sitecore CDP has a built-in identity resolution engine, and it is quite capable. But it is not magic. It depends on how you feed it data, in what order, and with what identifiers. Understanding the internals of how it works — and where it breaks — is what separates a solid implementation from a fragile one that starts showing cracks six months post-launch.

This post covers everything an architect or senior developer needs to know about identity resolution in Sitecore CDP. We will go through the mechanics, the event sequencing, the merge strategies, cross-device challenges, privacy implications, and what to do when things go wrong.

How Sitecore CDP Identifies a Visitor

When someone visits your site for the first time, CDP knows almost nothing about them. It knows a browser made a request. That is all. From that point, it starts building a picture using a few key mechanisms.

The browser identifier is the starting point — a first-party cookie set by CDP's JavaScript library on the first page load. This cookie carries a guest reference, which is CDP's internal handle for that anonymous session. Every event that flows in from that browser — page views, product clicks, form interactions — gets tagged with this guest reference and appended to the anonymous profile.

The customer identifier enters the picture when the visitor does something that reveals who they are: logging in, submitting a form with their email, completing a checkout. At that point, your application should fire an IDENTITY event to CDP's Cloud SDK API, including both the guest reference and the real customer identifier. That event is what triggers the merge.



CDP Identifier Types at a Glance:

  Browser Identifier   — Cookie set by CDP JS library on first visit

  Guest Reference      — Internal CDP handle for anonymous sessions

  Customer Identifier  — Real-world ID (email, CRM ID, loyalty number)

  Email Address        — Often used as the primary merge key

  Phone Number         — Secondary identifier in some configurations

  Custom Identifiers   — Any brand-specific ID passed via event payload



The Identity Resolution Algorithm

How Merging Actually Works

Identity resolution is the process of merging an anonymous profile with a known profile. In Sitecore CDP, this happens when the platform receives an event that includes both a guest reference (from the existing anonymous cookie) and a customer identifier (from a login or form submission).

The moment CDP receives that event, it looks up whether a profile already exists for that customer identifier. If one exists, the anonymous guest reference is merged into the known profile. If no existing known profile is found, CDP promotes the current anonymous profile to a known profile by attaching the customer identifier to it.

Here is what the merge process does, roughly speaking. CDP takes the behavioral history from the anonymous profile — sessions, events, page views, goals — and combines it with any data already stored in the known profile. The known profile's customer data (first name, last name, email, attributes from previous interactions) takes precedence in conflict situations, but the event history is additive. You do not lose events from either side.
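The precedence rules above can be sketched as two small pure functions. This is only an illustration of the behavior described in this post, not CDP's actual implementation; the `Profile` shape and the `mergeProfiles` and `promote` names are assumptions made for the example.

```typescript
// Illustrative model only: these types are assumptions, not CDP's data model.
interface TrackedEvent {
  type: string;
  timestamp: number; // epoch milliseconds
}

interface Profile {
  customerId: string | null; // null while the profile is anonymous
  attributes: Record<string, string>;
  events: TrackedEvent[];
}

// Merge an anonymous profile into an existing known profile:
// - attributes: the known profile wins when both sides define a key
// - events: additive and ordered by timestamp, nothing is dropped
function mergeProfiles(anonymous: Profile, known: Profile): Profile {
  return {
    customerId: known.customerId,
    attributes: { ...anonymous.attributes, ...known.attributes },
    events: [...anonymous.events, ...known.events].sort(
      (a, b) => a.timestamp - b.timestamp
    ),
  };
}

// No known profile exists yet: the anonymous profile is promoted in place.
function promote(anonymous: Profile, customerId: string): Profile {
  return { ...anonymous, customerId };
}
```

Spreading the known attributes last is what gives them precedence, while concatenating the event arrays keeps the history additive.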



Profile Reconciliation Logic

One question that comes up a lot is: what happens to the anonymous profile after the merge? In Sitecore CDP, the anonymous guest reference effectively becomes associated with the known profile. Future events from the same browser (using the same anonymous cookie) are automatically attributed to the known profile, even without sending the customer identifier again in every event.

This is an important architectural point. Once a browser has been merged with a known profile, CDP remembers that association. The cookie-to-profile mapping is persisted. So a return visit from the same browser — even without login — will route events to the known profile, not create a new anonymous one.

However, there are edge cases. If the visitor clears cookies, uses a private browsing session, or switches devices, CDP loses that browser-to-profile link. The visitor starts fresh as anonymous again — until they identify themselves once more. This is why cross-device tracking is handled separately, which we will cover shortly.

Handling Conflicts During Merge

Conflict handling is an area where many teams do not think carefully until they hit a problem in production. The most common conflict scenario is when two anonymous profiles need to merge because the same person used two different browsers before logging in.

For example: a visitor browses on Chrome at home, then the next day browses on Firefox at work, and logs in on Firefox. CDP will merge the Firefox anonymous profile with the known profile. But the Chrome anonymous profile is still floating separately, because CDP had no way of knowing they were the same person before login.

Eventually, if the person logs in on Chrome too, that anonymous profile will also get merged. CDP handles this gracefully — it adds the Chrome behavioral history to the existing known profile. But there is a window of time where some behavioral data sits in a separate anonymous profile until the second merge happens.
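To make the Chrome and Firefox scenario concrete, here is a toy in-memory model of the browser-to-profile mapping. It is a sketch under stated assumptions: the `IdentityGraph` class, its method names, and the `anon:` / `known:` key format are invented for illustration and say nothing about how CDP stores this internally.

```typescript
// Toy model of browser-to-profile linking (all names are hypothetical).
class IdentityGraph {
  // browserId -> profile key ("anon:<browserId>" or "known:<customerId>")
  private browserToProfile = new Map<string, string>();
  // profile key -> event types recorded against that profile
  private eventsByProfile = new Map<string, string[]>();

  private profileFor(browserId: string): string {
    let key = this.browserToProfile.get(browserId);
    if (key === undefined) {
      key = `anon:${browserId}`; // first sighting: new anonymous profile
      this.browserToProfile.set(browserId, key);
    }
    return key;
  }

  track(browserId: string, eventType: string): void {
    const key = this.profileFor(browserId);
    const events = this.eventsByProfile.get(key) ?? [];
    events.push(eventType);
    this.eventsByProfile.set(key, events);
  }

  // Deterministic linking: only an explicit customer ID connects a browser.
  identify(browserId: string, customerId: string): void {
    const from = this.profileFor(browserId);
    const to = `known:${customerId}`;
    if (from === to) return; // this browser is already linked
    if (from.startsWith("anon:")) {
      // Move the anonymous history into the known profile.
      const moved = this.eventsByProfile.get(from) ?? [];
      const existing = this.eventsByProfile.get(to) ?? [];
      this.eventsByProfile.set(to, [...existing, ...moved]);
      this.eventsByProfile.delete(from);
    }
    this.browserToProfile.set(browserId, to);
  }

  eventsFor(profileKey: string): string[] {
    return this.eventsByProfile.get(profileKey) ?? [];
  }
}

const graph = new IdentityGraph();
graph.track("chrome-home", "VIEW");
graph.track("firefox-work", "VIEW");
graph.identify("firefox-work", "crm-42");
// Chrome history still sits in its own anonymous profile at this point.
graph.identify("chrome-home", "crm-42");
// Both browsers now resolve to the same known profile.
```

Note that the sketch deliberately does not move history when a browser that is already linked to one customer identifies as another. How shared devices should behave is a design decision you need to make explicitly.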

Event Sequencing and Data Flow

The Full Journey from Anonymous to Known

Understanding the event sequence is critical for implementation. A lot of issues in CDP projects trace back to events being fired in the wrong order, or missing events that should have triggered the identity merge. Here is the typical flow:

 STEP 1 │ First Visit (Anonymous)                                              

        │ CDP JS library loads → sc_anonymous_id cookie set                   

        │ VIEW event fires → anonymous profile created in CDP                  

        │ Guest reference assigned                                                                                                              

 STEP 2 │ Continued Browsing                                                   

        │ More VIEW events fire → behavioral data accumulates                  

        │ All events tagged with same guest reference                                                                                             

 STEP 3 │ Login / Form Submission                                              

        │ User logs in → your app fires IDENTITY event to CDP Cloud SDK API      

        │ IDENTITY event payload includes: guest reference + customer ID                                                                      

 STEP 4 │ Identity Resolution                                                   

        │ CDP checks for existing profile with that customer ID               

        │ If found → merge anonymous history into known profile               

        │ If not found → promote anonymous profile to known                                                                                     

 STEP 5 │ Post-Merge                                                           

        │ All future events from same browser → routed to known profile       

        │ Personalization rules activate based on enriched profile            

        │ Audience segments recalculated in near real time    
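One way to protect the ordering in steps 3 and 4 is to hold back post-login events on the client until the IDENTITY event has gone out. The sketch below is a minimal example of that idea. `IdentityFirstQueue`, `loginStarted`, and the `CdpEvent` shape are hypothetical names for this post, and the `send` callback stands in for whatever function actually delivers events in your application.

```typescript
type CdpEvent = { type: string; [key: string]: unknown };

// Buffers events raised after login until the IDENTITY event has been sent.
class IdentityFirstQueue {
  private awaitingIdentity = false;
  private buffered: CdpEvent[] = [];
  private send: (event: CdpEvent) => void;

  constructor(send: (event: CdpEvent) => void) {
    this.send = send;
  }

  // Call as soon as the login succeeds, before any post-login tracking runs.
  loginStarted(): void {
    this.awaitingIdentity = true;
  }

  push(event: CdpEvent): void {
    if (!this.awaitingIdentity) {
      this.send(event); // anonymous browsing, or identity already sent
      return;
    }
    if (event.type !== "IDENTITY") {
      this.buffered.push(event); // hold until IDENTITY goes out
      return;
    }
    this.awaitingIdentity = false;
    this.send(event);
    // Flush held events in their original order.
    for (const held of this.buffered.splice(0)) {
      this.send(held);
    }
  }
}
```

This only guarantees the order in which events are handed to the sender. If sending is asynchronous, you would also need to wait for the IDENTITY call to resolve before flushing.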



Merge Strategies in Real Projects

Sitecore CDP uses deterministic identity matching. Merges happen only when a concrete, known identifier is provided — not based on inferred signals like IP address or browser fingerprint. In most enterprise implementations, this is the right approach. The key is capturing that identifier consistently across every meaningful touchpoint: login, checkout, newsletter signup, account update.

In practice, teams use a few common patterns:
  • Email-first strategy: Email address is the primary merge key. Everything else (CRM ID, loyalty ID) is treated as supplementary. This works well when email capture is consistent across channels but can cause issues if users have multiple email addresses.
  • CRM ID strategy: A stable, platform-generated identifier from your CRM or commerce system is the primary key. This is more durable than email (users change emails more often than you might think) and is generally recommended for mature implementations.
  • Multi-identifier strategy: You send multiple identifiers in the IDENTITY event (email, phone, and CRM ID, for example) and configure CDP to use a specific one as the primary. This is the most robust approach but requires clear data governance to avoid mismatches.
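Whichever pattern you choose, it helps to centralize the choice of primary key in one function instead of repeating it at every touchpoint. The following is a minimal sketch of a CRM-ID-first policy. The provider names, the priority order, and the `pickIdentifiers` helper are assumptions for illustration; the provider values you actually send must match the identity rules configured in your own tenant.

```typescript
type Provider = "CRM_ID" | "EMAIL" | "PHONE";

interface Identifier {
  provider: Provider;
  id: string;
}

// Order expresses which identifier is preferred as the primary merge key.
const PRIORITY: Provider[] = ["CRM_ID", "EMAIL", "PHONE"];

function normalize(provider: Provider, raw: string): string {
  const value = raw.trim();
  // Emails are lowercased so "Jane@X.com" and "jane@x.com" map to one key.
  return provider === "EMAIL" ? value.toLowerCase() : value;
}

// Returns the primary identifier plus any secondary ones, or null if the
// visitor supplied nothing usable.
function pickIdentifiers(
  raw: Partial<Record<Provider, string>>
): { primary: Identifier; secondary: Identifier[] } | null {
  const available: Identifier[] = [];
  for (const provider of PRIORITY) {
    const value = raw[provider];
    if (value !== undefined && value.trim() !== "") {
      available.push({ provider, id: normalize(provider, value) });
    }
  }
  const primary = available[0];
  if (primary === undefined) return null;
  return { primary, secondary: available.slice(1) };
}
```

Keeping normalization next to the selection logic means every channel produces the same key for the same person, which is where most mismatches come from.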




Duplicate profiles are a reality in any CDP project. They accumulate when the same person enters through different channels before identifying themselves, or uses different email addresses on different touchpoints. Cleanup usually happens via CDP's Batch API — you run a batch of IDENTITY events that link duplicate guest references to the canonical customer identifier, effectively chaining the merges. Most teams need to do at least one cleanup pass post-launch, especially after adding a new integrated channel.
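Before running a cleanup batch you need a list of candidates. A simple first pass is to group exported profiles by normalized email, as in the sketch below. The `ProfileSummary` shape and the `findDuplicateCandidates` function are hypothetical and assume you have already exported guest references and emails from your tenant by some means.

```typescript
interface ProfileSummary {
  guestRef: string;
  email?: string;
}

// Groups profiles by normalized email. Any group with more than one guest
// reference is a duplicate candidate that deserves a manual or batch review.
function findDuplicateCandidates(
  profiles: ProfileSummary[]
): Map<string, string[]> {
  const byEmail = new Map<string, string[]>();
  for (const profile of profiles) {
    if (!profile.email) continue; // anonymous or email-less profiles are skipped
    const key = profile.email.trim().toLowerCase();
    const refs = byEmail.get(key) ?? [];
    refs.push(profile.guestRef);
    byEmail.set(key, refs);
  }
  const duplicates = new Map<string, string[]>();
  byEmail.forEach((refs, key) => {
    if (refs.length > 1) duplicates.set(key, refs);
  });
  return duplicates;
}
```

Email matching only finds one class of duplicates. People who used different addresses on different touchpoints need a second key, such as CRM ID, to be detected.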

Cross-Device Tracking and Privacy Constraints

Cross-Device Challenges

The only reliable way to link sessions across devices in Sitecore CDP is through authenticated identification. When a user logs in on their phone, the IDENTITY event associates that mobile browser's guest reference with the known profile. When they log in on their laptop, the same thing happens. Over time, CDP builds a unified view.
But the two devices are not connected in real time. If the laptop session never included a login, that behavioral history sits in a separate anonymous profile until the user authenticates on that device too. Cross-device unification is an ongoing process, not a one-time event.
Private browsing is a category of its own. When a visitor uses incognito mode, cookies do not persist across sessions. Every new private window is a fresh anonymous visitor from CDP's perspective. There is no clean technical solution within CDP's native capabilities for this — and trying to engineer around it creates both complexity and privacy risk. Accept it as a limitation and focus your energy on the authenticated journey.

Cookie Consent and GDPR Impact

Consent management has direct functional implications for your tracking layer, not just your legal documentation. CDP relies on its first-party cookie to maintain the anonymous profile across sessions. If a visitor declines tracking consent, that cookie should not be set — which means CDP cannot build a persistent anonymous profile for that visitor.
You need to conditionally initialize CDP's JavaScript library based on the user's consent state. If your CMP signals that analytics or targeting cookies are accepted, CDP initializes with full tracking. If consent is declined, CDP either does not load or operates in a cookieless mode where session continuity is limited. This integration needs to be built and tested from the start — retrofitting consent handling late in a project is painful.

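// Example: conditional CDP initialization.
// consentManager, initializeSitecoreCDP and getGuestRef are placeholder names
// for your CMP's API and your own initialization wrapper.
if (consentManager.hasConsent('analytics')) {
  // Consent given: full tracking with the persistent first-party cookie
  initializeSitecoreCDP({ guestContextId: getGuestRef() });
} else {
  // Consent declined: CDP loads without persistent cookie tracking
  initializeSitecoreCDP({ cookieless: true });
}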

When tracking consent is denied, you can still deliver contextual personalization based on non-personal signals: the current page, URL parameters, campaign attribution, device type, and country-level geography. It is less powerful than profile-driven personalization, but it is compliant and still adds value.
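As a rough illustration of what those non-personal signals look like in code, the sketch below derives them from the current URL and user agent only. The `ContextSignals` shape and `contextSignals` function are assumptions for this example, and the `utm_` parameter names assume your campaigns follow the common UTM convention.

```typescript
interface ContextSignals {
  path: string;
  campaign: string | null;
  source: string | null;
  deviceType: "mobile" | "desktop";
}

// Derives personalization inputs from the request itself. Nothing here is
// stored or tied to a profile, so it works without a tracking cookie.
function contextSignals(pageUrl: string, userAgent: string): ContextSignals {
  const url = new URL(pageUrl);
  return {
    path: url.pathname,
    campaign: url.searchParams.get("utm_campaign"),
    source: url.searchParams.get("utm_source"),
    // Deliberately coarse: enough for layout-level decisions, not fingerprinting.
    deviceType: /Mobi/i.test(userAgent) ? "mobile" : "desktop",
  };
}
```

Country-level geography is left out because it normally comes from your edge or CDN layer rather than from anything available in the browser.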


When Identity Resolution Fails

Production always surfaces edge cases that testing does not. Here are the failure scenarios architects encounter most often, and how to handle them.
  • Safari's Intelligent Tracking Prevention (ITP) blocks third-party cookies and also limits the lifetime of some first-party cookies. Even if you are using first-party cookies, ITP may cap the expiry of cookies written from JavaScript at 7 days in certain configurations. This means a user who last visited two weeks ago can appear as a new anonymous visitor, even if they previously identified themselves. The mitigation is to set your CDP cookie server-side, through an HTTP response header, rather than from JavaScript.
  • Ad blockers frequently block requests to CDP's Cloud API endpoint. Client-side event sending is vulnerable to this. For high-value interactions — purchase completions, form submissions — implement server-side event sending via CDP's Batch API as a fallback. Server-to-server calls bypass client-side blocking entirely.
  • Missing or late IDENTITY events are probably the most common implementation bug. This usually happens in SPAs where the login interaction does not trigger the correct event, or where the IDENTITY event fires but with an incomplete payload. Thorough end-to-end testing of the login flow — checking CDP's event stream directly — is the only reliable way to catch this.
  • When identity resolution cannot complete, experiences should degrade gracefully. Every Sitecore Personalize experience should have a sensible default variant that applies when the visitor is unknown or the profile is incomplete. Experiences designed only for fully-known visitors will misfire on a meaningful percentage of real-world sessions.

Troubleshooting Approach

When identity resolution is not working as expected, the debugging process typically follows this order: First, verify in CDP's event stream that events are being received with the correct guest reference and customer identifier. Second, check the order of events: is the IDENTITY event firing before other post-login events? Third, look at the profile in CDP's Guest Profile Viewer and check whether the known customer identifier is attached. Fourth, inspect the browser's cookie storage to confirm the anonymous cookie is being set and persisted correctly.
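The graceful-degradation point above is easier to enforce when variant selection has an explicit fallback path. Here is a minimal sketch. The variant names, the `VisitorProfile` shape, and `chooseVariant` are all hypothetical; in Sitecore Personalize the equivalent logic would live in your experience conditions rather than in application code.

```typescript
type Variant = "default" | "returning-visitor" | "known-customer";

interface VisitorProfile {
  guestType: "visitor" | "customer";
  firstName?: string;
  sessionCount: number;
}

// Every branch that cannot be satisfied falls through to "default", so an
// unknown, partially merged, or missing profile never breaks the page.
function chooseVariant(profile: VisitorProfile | null): Variant {
  if (profile === null) return "default";
  if (profile.guestType === "customer" && profile.firstName) {
    return "known-customer";
  }
  if (profile.sessionCount > 1) return "returning-visitor";
  return "default";
}
```

The useful property is that the richest variant has the strictest condition, and a customer whose profile is missing the fields that variant needs is treated like any other returning visitor.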

Architecture Best Practices

Data Governance Starts at the Identifier Level

Before writing a single line of CDP integration code, your team needs to answer one question: what is your canonical customer identifier? This sounds obvious, but in large enterprises with multiple systems — a CRM, a loyalty platform, an e-commerce engine, a mobile app backend — there are often competing identifiers. Your CDP implementation will only be as good as the consistency of the identifiers flowing into it. Resolve this at the architecture stage — not during development.

Event Naming Standards

Define your event taxonomy upfront and treat it as a versioned artifact. CDP is an event-driven platform. The quality of your behavioral data depends entirely on the consistency and clarity of your event taxonomy. Use a standardized naming convention across all channels — web, mobile app, email, contact center. If your web team calls the login event IDENTITY but your mobile team calls it USER_LOGIN with different payload structures, you end up with data that is hard to reconcile.

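// ---------------------- CDP EVENT ----------------------
// Excerpt from a form submit handler: route, pageVariantId, language and
// formData come from the surrounding component. The import belongs at the
// top of the module.
import { event } from '@sitecore-cloudsdk/events/browser';

// Join only the name parts that are present, so a missing middle name
// does not leave a double space in the value sent to CDP.
const fullName = [formData.firstName, formData.middleName, formData.surname]
  .filter(Boolean)
  .join(' ');

try {
  await event({
    type: 'CONTACT_DETAILS_FORM_SUBMITTED',
    channel: 'WEB',
    currency: 'USD',
    page: route?.name,
    pageVariantId,
    language,
    payload: {
      Name: fullName,
      Email: formData.email,
      Home: formData.home || '',
      Work: formData.work || '',
    },
  });
} catch (err) {
  // Tracking failures are logged but must never block the form submission
  console.error('Error sending event to CDP:', err);
}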
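A versioned taxonomy is much easier to keep consistent when event names are checked automatically, for example in a unit test or a pre-send guard. The sketch below assumes an UPPER_SNAKE_CASE convention and a small shared registry. Both the pattern and the registry contents are illustrative choices, not something CDP enforces.

```typescript
// Assumed convention: UPPER_SNAKE_CASE, starting with a letter.
const EVENT_TYPE_PATTERN = /^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$/;

// Assumed shared registry; in practice this would be the versioned artifact
// that web, mobile, and other channel teams all import.
const REGISTERED_EVENT_TYPES = new Set<string>([
  "VIEW",
  "IDENTITY",
  "CONTACT_DETAILS_FORM_SUBMITTED",
]);

// Returns a list of problems; an empty list means the name is acceptable.
function validateEventType(type: string): string[] {
  const problems: string[] = [];
  if (!EVENT_TYPE_PATTERN.test(type)) {
    problems.push(`"${type}" is not UPPER_SNAKE_CASE`);
  }
  if (!REGISTERED_EVENT_TYPES.has(type)) {
    problems.push(`"${type}" is not in the shared taxonomy`);
  }
  return problems;
}
```

With a check like this, a mobile team introducing USER_LOGIN would fail validation until the name is either added to the registry or aligned with the existing IDENTITY event.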

Identifier Strategy

Send multiple identifiers whenever possible — email AND CRM ID, for example — but nominate one as the primary merge key. Configure CDP to use that primary key for identity resolution. The secondary identifiers become attributes on the profile, useful for cross-system lookups but not the basis for merging.

try {
  const { identity } = await import("@sitecore-cloudsdk/events/browser");

  const eventData = {
    channel:  "WEB",
    language: "EN",
    currency: "USD",
    // ← PRIMARY MERGE KEY — provider must match CDP identity rule name
    identifiers: [{ provider: "CRM_ID", id: crmId.trim() }],
    // ← SUPPLEMENTARY PII
    ...(email     && { email:     email.trim()     }),
    ...(firstName && { firstName: firstName.trim() }),
    ...(lastName  && { lastName:  lastName.trim()  }),
    ...(phone     && { phone:     phone.trim()     })
  };

  const extensionData = {
    source: "react-app",
    identityMethod: "CRM_ID",
  };

  log(`Calling identity() — payload: ${JSON.stringify(eventData)}`, "sdk");

  await identity(eventData, extensionData);

  log("IDENTITY event accepted — CDP running linking algorithm", "success");
  log("Guest type: VISITOR → CUSTOMER", "success");

  setProfileData({
    crmId: crmId.trim(),
    ...(email     && { email }),
    ...(firstName && { firstName }),
    ...(lastName  && { lastName }),
    ...(phone     && { phone }),
    guestType:  "customer",
    resolvedAt: new Date().toISOString(),
    sdk:        "@sitecore-cloudsdk/events/browser",
  });
  setStatus("success");
  setActiveTab("profile");

} catch (err) {
  log(`IDENTITY event failed: ${err.message}`, "error");
  setStatus("error");
}

Profile Enrichment Considerations

Identity resolution is the foundation, but it is not the end goal. Once you have a unified profile, the next step is enriching it: pulling in attributes from CRM, purchase history from commerce, and preference data from loyalty programs. In Sitecore CDP, this happens through Batch API and Stream API ingestion.

API Considerations

CDP has two main APIs for server-side integration: the Cloud SDK API (real-time, synchronous) and the Batch API (asynchronous, bulk). For identity resolution, the Cloud SDK API is the right tool — it processes events in real time and triggers immediate profile merges. The Batch API is better suited for bulk historical data imports and profile attribute updates. Always implement retry logic for Cloud SDK API calls. 
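For the retry logic mentioned above, a common approach is exponential backoff with jitter. The sketch below is a generic helper, not part of any Sitecore SDK. `withRetry`, `backoffDelays`, and the default values are assumptions you should tune, and `operation` stands in for whatever function performs your server-side call.

```typescript
// Delay ceiling before each retry: exponential growth, capped at capMs.
// The first attempt has no delay, so there are maxAttempts - 1 entries.
function backoffDelays(maxAttempts: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(Math.min(capMs, baseMs * Math.pow(2, retry)));
  }
  return delays;
}

// Runs `operation` until it succeeds or maxAttempts is exhausted, then
// rethrows the last error.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 4,
  baseMs = 200,
  capMs = 5000
): Promise<T> {
  const delays = backoffDelays(maxAttempts, baseMs, capMs);
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const ceiling = delays[attempt];
      if (ceiling === undefined) break; // no retries left
      // Full jitter: wait a random time up to the ceiling so that many
      // clients do not retry in lockstep.
      const waitMs = Math.random() * ceiling;
      await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
    }
  }
  throw lastError;
}
```

In a real integration you would usually retry only on transient failures such as timeouts or 5xx responses, and confirm that resending the same IDENTITY event is harmless for your setup.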

Cloud SDK initialization code:

import { useEffect, JSX } from 'react';
import { SitecorePageProps } from 'lib/page-props';
import { CloudSDK } from '@sitecore-cloudsdk/core/browser';
import '@sitecore-cloudsdk/events/browser';
import '@sitecore-cloudsdk/personalize/browser';
import config from 'temp/config';
import { LayoutServicePageState, RenderingType } from '@sitecore-jss/sitecore-jss-nextjs';

/**
 * The Bootstrap component is the entry point for performing any initialization logic
 * that needs to happen early in the application's lifecycle.
 */
const Bootstrap = (props: SitecorePageProps): JSX.Element | null => {
  // Browser ClientSDK init allows for page view events to be tracked
  useEffect(() => {
    const pageState = props.layoutData?.sitecore?.context.pageState;
    const renderingType = props.layoutData?.sitecore?.context.renderingType;
    console.log('environment', process.env.NODE_ENV);
   
    // Skip initialization in edit and preview modes only
    if (
      pageState !== LayoutServicePageState.Normal ||
      renderingType === RenderingType.Component
    ) {
      console.debug('Browser Events SDK is not initialized in edit and preview modes');
      return;
    }

    // Initialize Cloud SDK in both development and production environments
    try {
      console.debug('Initializing Browser Events SDK');
      CloudSDK({
        sitecoreEdgeUrl: config.sitecoreEdgeUrl,
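        // By default Next.js only exposes NEXT_PUBLIC_-prefixed variables to browser code,
        // so confirm this variable resolves client-side; otherwise the literal fallback is always used.
        sitecoreEdgeContextId: process.env.SITECORE_EDGE_CONTEXT_ID || '7k2RMDcj04zEcytgGeS5Zq',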
        siteName: props.site?.name || config.sitecoreSiteName,
        enableBrowserCookie: true,
        // Replace with the top level cookie domain of the website that is being integrated e.g ".example.com" and not "www.example.com"
        cookieDomain: process.env.NEXT_PUBLIC_COOKIE_DOMAIN,
      })
        .addEvents()
        .addPersonalize({ enablePersonalizeCookie: true, webPersonalization: true }) // Initialize the personalize package
        .initialize();
      console.debug('Browser Events SDK initialized successfully');
    } catch (error) {
      console.warn('Cloud SDK initialization failed or already initialized:', error);
    }
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, [props.site?.name]);

  return null;
};

export default Bootstrap;

Debugging Approaches

For debugging, maintain a dedicated test profile or set of test profiles in your CDP tenant for integration testing. Use a known, synthetic customer identifier (like test-user-001@yourcompany.com) that you can trace through the entire event sequence. This makes it much easier to verify that identity resolution is working correctly without polluting your production profile data.
Also consider implementing a lightweight event logging middleware that captures every payload sent to CDP before it goes out. Storing these logs (even temporarily) gives you a timeline of exactly what was sent, in what order, which is invaluable for debugging timing and sequencing issues.
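A logging layer like this can be as small as a wrapper around your send function. The sketch below keeps the most recent payloads in memory. `withPayloadLog` and its `capacity` parameter are hypothetical names, and `send` stands in for whichever function your application uses to deliver events.

```typescript
interface LoggedCall<P> {
  sequence: number; // monotonically increasing, shows send order
  sentAt: string;   // ISO timestamp taken just before sending
  payload: P;
}

// Wraps a send function so every payload is recorded before it goes out.
// Payloads must be JSON-serializable for the snapshot to work.
function withPayloadLog<P, R>(send: (payload: P) => R, capacity = 200) {
  const entries: LoggedCall<P>[] = [];
  let sequence = 0;

  const loggedSend = (payload: P): R => {
    // Snapshot first, so later mutation of the object cannot rewrite history.
    const snapshot = JSON.parse(JSON.stringify(payload)) as P;
    sequence += 1;
    entries.push({ sequence, sentAt: new Date().toISOString(), payload: snapshot });
    if (entries.length > capacity) entries.shift(); // keep only the newest
    return send(payload);
  };

  return {
    send: loggedSend,
    entries: (): readonly LoggedCall<P>[] => entries,
  };
}
```

Because these payloads can contain email addresses and other personal data, keep retention short and consider redacting sensitive fields before anything leaves the browser or server memory.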

What Teams Should Avoid

Avoid sending the IDENTITY event only at one point in the journey (like purchase confirmation) when you could be sending it at login, form submission, and other earlier touchpoints. The earlier you capture identity, the more pre-login behavioral data gets stitched into the known profile.
Do not assume that CDP will handle everything automatically. It is a sophisticated platform, but it is not self-configuring. The quality of your identity resolution is a direct function of the quality of the events you send to it — correct identifiers, correct timing, correct sequencing.
Avoid building personalization experiences that are brittle when identity resolution has not yet completed. Always design for the case where the profile is incomplete, anonymous, or partially merged. Experiences that assume a fully known profile will misfire in a significant percentage of real-world sessions.
Finally, do not skip the post-launch monitoring phase. Identity resolution issues often do not surface in testing because test scenarios are too clean and controlled. In production, you encounter edge cases — users with corrupted cookies, users who logged in through SSO without the IDENTITY event firing correctly, users from consent-restricted geographies. Set up monitoring for profile fragmentation and duplicate detection from day one.

Conclusion

Identity resolution is not a feature you configure once and forget. It is an ongoing architectural concern that touches your data governance, event strategy, consent management, and front-end integration all at once.
Sitecore CDP gives you a capable foundation. The platform's ability to merge anonymous behavioral history with known profiles in near real time is genuinely useful when the data feeding it is clean and correctly sequenced. The implementations that struggle are almost always struggling because of data quality or event sequencing problems — not platform limitations.
The teams that get this right are the ones that treat identifier strategy and event taxonomy as first-class design decisions, not implementation details. If you invest that thinking upfront, the rest of the CDP implementation tends to follow reasonably well.
