[Drupal] How to create and run a php script to remove year from title and store it seperately?

| | 6 min read

In this article I will be explaining a task I did for one of our sites in Drupal6. In this site, for every content type say event the users manually entered year along with the title in title field. So my task was to remove the year from title and save it in a seperate cck field.

Here since the year is manually entered in title field, different users entered in different format. So first manually checked a couple of nodes to found out how different users entered the year. After a few rounds of check I found out following cases.

  • Steptoe and Son - UK Tour - 2012/2013
  • Steptoe and Son - UK Tour - 2012/13
  • Henry Moore and Auguste Rodin: Moore Rodin - Henry Moore Foundation - 2013
  • In between Title year comes like "Imagined Lives: Mystery Portraits 1520-1640 - National Portrait Gallery"
  • The Family Jewels - Marina and the Diamonds(with no year)

For each case need to the following

  • Steptoe and Son - UK Tour - 2012/2013 - Save year as 2012 and title should be "Steptoe and Son - UK Tour"
  • Steptoe and Son - UK Tour - 2012/13 - Save year as 2012 and title should be "Steptoe and Son - UK Tour"
  • Henry Moore and Auguste Rodin: Moore Rodin - Henry Moore Foundation - 2013 - Save year as 2013 and title should be "Henry Moore and Auguste Rodin: Moore Rodin - Henry Moore Foundation"
  • If year comes between title like "Imagined Lives: Mystery Portraits 1520-1640 - National Portrait Gallery" - Get year from created date and title should be "Imagined Lives: Mystery Portraits 1520-1640 - National Portrait Gallery"
  • The Family Jewels - Marina and the Diamonds(with no year) - Get year from created date and title should be"The Family Jewels - Marina and the Diamonds"

At first for saving the year, I added a new cck field Year to the corresponding content type "event".

  • Label -> Year
  • Field name -> field_event_year
  • Type of data to store. -> Select a field type -> Date
  • Form element to edit the data. -> Select a widget type -> Text Field with custom input format

Once the followings fields are entered click save.This will take you to field configuration page. In this page add the following settings.

  • DEFAULT VALUE -> Now
  • CUSTOM INPUT FORMAT -> 'Y'
  • Type of data to store. -> Select a field type -> Date
  • NUMBER OF VALUES -> 1
  • TO DATE -> Never
  • GRANULARITY -> Year
  • TIME ZONE HANDLING -> No time zone conversion

Now we have added the required fields and we know what to do in each case.

Next question comes to your mind where to write this script. Goto location where index.php file is saved and make a copy of index.php file rename it accordingly. Here I named it as "rm_year_frm_title.php".

In this file remove all the lines except the following two lines

require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

Now can call this file as "url/rm_year_frm_title.php".

Next part will be to write a query to get all the nodes comes under content type "event".

Few points to remember while writing this query are

  • First need to join two tables because title comes under table node and year comes under table content_type_event.
  • Now when you join this table codition should be nid of both tables should be same
  • Now when select the values it should satisfy following conditions
  1. Content Type should be event
  2. vid of both tables should be same(to make sure that in the script it takes values of last updated content)
  3. Only select the values of the field if the field event year is empty
  4. Also remember to set a limit so that once we run the script we could check whether the output is correct

Now comes our cases. In this I checked each case using regular expressions. I get the year and title using function preg_match and preg_replace. by seeing the code you can understand where each conditions are called

Now when you store value to year field keep in mind that when we created the cck field year we saved it as type Date, so when you save the year programaticaly it should be saved in following format "YYYY-MM-DDTHOUR:MIN:SEC".

Now you refer the code. To run the code save first save the file as "rm_year_frm_title.php" and saved it in location where index.php is saved. To run the script as "url/rm_year_frm_title.php".

<?php

/**
 * @file
 * The PHP page that serves all page requests on a Drupal installation.
 *
 * The routines here dispatch control to the appropriate handler, which then
 * prints the appropriate page.
 *
 * All Drupal code is released under the GNU General Public License.
 * See COPYRIGHT.txt and LICENSE.txt.
 */

require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$result = db_query("SELECT  n.nid AS nid,n.vid AS vid,n.title AS title,n.created AS created,c.nid AS cnid,c.vid AS cvid,c.field_event_year_value AS field_event_year_value,  n.type AS type  FROM  {node} n LEFT JOIN  {content_type_event} c ON  c.nid = n.nid  WHERE  type IN ('event') AND c.vid = n.vid AND field_event_year_value  IS NULL Limit 5");
  while ($resultset = db_fetch_object($result)) {
    /*Title - 2012/2013 */
    $pattern = "(\d{4}/\d{4})";
    preg_match($pattern, $resultset->title, $yearfull);
    if (isset($yearfull[0])) {
      $nodeload=node_load($resultset->nid);
      preg_match('(\d{4})', $yearfull[0], $year);
      $patterns='((( ?-|,) ?)?\d{4}/\d{4})';
      $replacements=' ';
      $string=$resultset->title;
      $nodeload->title= preg_replace($patterns, $replacements, $string);
      $nodeload->field_event_year[0]['value']=$year[0] . '-00-00T00:00:00';
      node_save($nodeload);
      print $resultset->nid;
      print "\n";
    }
    else {
      /*Title - 2012/13 */
      $pattern = "(\d{4}/\d{2})";
      preg_match($pattern, $resultset->title, $yearfull);
      if (isset($yearfull[0])) {
        $nodeload=node_load($resultset->nid);
        preg_match('(\d{4})', $yearfull[0], $year);
        $patterns='((( ?-|,) ?)?\d{4}/\d{2})';
        $replacements=' ';
        $string=$resultset->title;
        $nodeload->title= preg_replace($patterns, $replacements, $string);
        $nodeload->field_event_year[0]['value']=$year[0] . '-00-00T00:00:00';
        node_save($nodeload);
        print "\n";
        print $resultset->nid;
        print "\n";
      }
      else {
        /*Title - 2012 */
        ///$pattern = "((( ?-|,) ?)?\d{4})";
        $pattern = "(\d{4}$)";
        $title = $resultset->title;
        $title=trim($title,"  ");
        preg_match($pattern, $title, $yearfull);
        if (isset($yearfull[0])) {
          preg_match('(\d{4})', $yearfull[0], $year);
          $patterns = "((( ?-|,) ?)?\d{4}$)";
          $nodeload=node_load($resultset->nid);
          $replacements=' ';
          $string=$title;
          $nodeload->title= preg_replace($patterns, $replacements, $string);
          $nodeload->field_event_year[0]['value']=$year[0] . '-00-00T00:00:00';
          node_save($nodeload);
          print "\n";
          print $resultset->nid;
          print "\n";
        }
        else {
          /*Title*/
          $nodeload=node_load($resultset->nid);
          $year= date('Y', $resultset->created);
          $nodeload->field_event_year[0]['value']=$year . '-00-00T00:00:00';
          node_save($nodeload);
          print "\n";
          print $resultset->nid;
          print "\n";
        }
      }
    }
  }

To check the status of your script you can run the following script as before, named as "count_remain.php".

<?php

/**
 * @file
 * The PHP page that serves all page requests on a Drupal installation.
 *
 * The routines here dispatch control to the appropriate handler, which then
 * prints the appropriate page.
 *
 * All Drupal code is released under the GNU General Public License.
 * See COPYRIGHT.txt and LICENSE.txt.
 */

require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$result = db_query("SELECT  count(n.nid) AS nid FROM {node} n LEFT JOIN {content_type_event} c ON c.nid = n.nid  WHERE type IN ('event') AND field_event_year_value IS NOT NULL");
  while ($resultset = db_fetch_object($result)) {
    print "\n";
    print "Count of Updated Nodes -- ".$resultset->nid;
    print "\n";
  }

$remaining = db_query("SELECT count(n.nid) AS nid FROM {node} n LEFT JOIN {content_type_event} c ON c.nid = n.nid WHERE type IN ('event') AND field_event_year_value IS NULL");
  while ($remainingset = db_fetch_object($remaining)) {
    print "\n";
    print "Count of Remaining Nodes -- ".$remainingset->nid;
    print "\n";
  }

$total = db_query("SELECT  count(n.nid) AS nid FROM {node} n LEFT JOIN {content_type_event} c ON  c.nid = n.nid  WHERE type IN ('event')");
  while ($totalset = db_fetch_object($total)) {
    print "\n";
    print "Count of Total Nodes -- ".$totalset->nid;
    print "\n";
  }

If the count of node is higher its not possible to run the script in one go. In-order to achiew this we have to set a cron job. To know how to set and run a cron job for our script "rm_year_frm_title.php" in Linux OS you can view the following articles. Add Jobs To cron Under Linux or UNIX?